Abstract. This paper is a part of our programme to generalise the Hardy-Littlewood
method to handle systems of linear questions in primes. This programme is laid out in
our paper Linear equations in primes [14]. In particular, the results of this paper may
be used, together with the machinery of [14], to establish an asymptotic for the number
of four-term progressions $p_1 < p_2 < p_3 < p_4 \leq N$ of primes, and more generally any
problem counting prime points inside a "non-degenerate" affine lattice of codimension
at most 2.

The main result of this paper is a proof of the Möbius and Nilsequences Conjecture
for 1 and 2-step nilsequences. This conjecture is introduced in [14] and amounts to
showing that if $G/\Gamma$ is an $s$-step nilmanifold, $s \leq 2$, if $F : G/\Gamma \to [-1, 1]$ is a Lipschitz
function, and if $T_g : G/\Gamma \to G/\Gamma$ is the action of $g \in G$ on $G/\Gamma$, then

$$N^{-1} \sum_{n \leq N} \mu(n) F(T_g^n \cdot x) \ll_{A, G/\Gamma} \|F\|_{\mathrm{Lip}} \log^{-A} N$$

uniformly in $g \in G$ and $x \in G/\Gamma$, for any $A > 0$. This can be viewed as a "quadratic"
generalisation of an exponential sum estimate of Davenport [7], and is proven by
following the methods of Vinogradov and Vaughan.
1. Introduction

The Möbius function $\mu : \mathbb{N} \to \{-1, 0, +1\}$, defined by

$$\mu(n) := \begin{cases} (-1)^k & \text{if } n = p_1 p_2 \cdots p_k \text{ for distinct primes } p_1, \ldots, p_k \\ 0 & \text{if } n \text{ is not squarefree} \\ 1 & \text{if } n = 1 \end{cases}$$

plays a fundamental role in analytic number theory, especially with regard to the
distribution of primes. A well-known metaprinciple holds that $\mu$ fluctuates so "randomly"
that it is asymptotically orthogonal to any "low complexity" bounded sequence
$f : \mathbb{N} \to \mathbb{C}$. We do not have a formal definition of "low complexity", but the examples of
this section should convey the general flavour. Functions which arise from geometry and
algebra, such as characters $n \mapsto e(n\alpha)$, are certainly of low complexity, whereas functions
which depend on the prime factorisation of $n$, such as $\mu$ itself, the von Mangoldt
function $\Lambda$, and certain divisor sums arising in sieve theory, are not.
In our first example, and throughout the paper, we will use the following notation.
We write $[N] := \{1, \ldots, N\}$ to denote the integers from 1 to $N$, and
$\mathbb{E}_{n \in A} f(n) := \frac{1}{|A|} \sum_{n \in A} f(n)$ to denote the average of a function $f : A \to \mathbb{C}$ on a non-empty finite set
$A$. We also use $X \ll Y$ or $X = O(Y)$ to denote the claim that $|X| \leq CY$ for some
absolute constant $C > 0$.

Example 1 ($\mu$ is strongly orthogonal to the constant function). We have

$$\mathbb{E}_{n \in [N]} \mu(n) \ll e^{-c\sqrt{\log N}} \qquad (1.1)$$
for all $N > 1$ and some absolute constant $c > 0$.

Remark. This is essentially equivalent to the prime number theorem with the classical 
error term of Hadamard and de la Vallee Poussin. 
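
To get a feel for the cancellation in (1.1), one can tabulate the averages $\mathbb{E}_{n \in [N]} \mu(n)$ for small $N$. The following is a minimal sketch only; the helper names `mobius_sieve` and `mobius_average` are ours and do not appear elsewhere in the paper, and a numerical table of this kind illustrates the estimate without, of course, proving anything about its rate of decay.

```python
def mobius_sieve(limit):
    """Return a list mu with mu[n] equal to the Möbius function of n for 1 <= n <= limit."""
    mu = [1] * (limit + 1)
    mu[0] = 0  # index 0 is unused
    is_prime = [True] * (limit + 1)
    for p in range(2, limit + 1):
        if not is_prime[p]:
            continue
        for m in range(2 * p, limit + 1, p):
            is_prime[m] = False
        for m in range(p, limit + 1, p):
            mu[m] = -mu[m]  # one more prime factor flips the sign
        for m in range(p * p, limit + 1, p * p):
            mu[m] = 0  # divisible by p^2, hence not squarefree
    return mu


def mobius_average(mu, N):
    """The average E_{n in [N]} mu(n) appearing in (1.1)."""
    return sum(mu[1:N + 1]) / N


if __name__ == "__main__":
    mu = mobius_sieve(10**5)
    for N in (10**2, 10**3, 10**4, 10**5):
        print(N, mobius_average(mu, N))
```

The sieve visits each prime once and flips the sign of its multiples, so the whole table up to `limit` is produced in a single pass rather than by factoring each integer separately.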



In the next example, and throughout the paper, we use $X \ll_A Y$ or $X = O_A(Y)$ to
denote the claim that $|X| \leq C_A Y$ for some constant $C_A > 0$ depending on $A$.

Example 2 ($\mu$ is strongly orthogonal to Dirichlet characters). For any $A > 0$ we have

$$\mathbb{E}_{n \in [N]} \mu(n) \overline{\chi(n)} \ll_A q^{1/2} \log^{-A} N \qquad (1.2)$$

for all $q$ and all Dirichlet characters $\chi$ to modulus $q$.

Remark. See for instance [16, Corollary 5.29]. This may be used to prove the
Siegel-Walfisz theorem concerning the distribution of primes in arithmetic progressions.

The form of the bound in (1.2) may appear strange at first sight. A key point to
appreciate is that the implied constant $C = C_A$ is ineffective, due to the possible
existence of Landau-Siegel zeros. The book [8] may be consulted for further information.
It is useful to have a name for bounds of this kind.

Definition 1.1 (Strong asymptotic orthogonality). If $f : \mathbb{N} \to \mathbb{C}$ and $g : \mathbb{N} \to \mathbb{C}$ are
two sequences on the natural numbers $\mathbb{N} = \{1, 2, 3, \ldots\}$, we say that $f$ and $g$ are strongly
asymptotically orthogonal if we have the estimate

$$\mathbb{E}_{n \in [N]} f(n) \overline{g(n)} \ll_A \log^{-A} N$$

for all $N > 1$ and all $A > 0$. We allow the implied constant $C_A$ to be ineffective, in that
we may have no explicit bounds on $C_A$ other than that it is finite.

Thus Example 2 shows that $\mu$ is strongly asymptotically orthogonal to all Dirichlet
characters, and some Fourier analysis then shows that it is in fact strongly asymptotically
orthogonal to any periodic sequence. In fact, more is true, as we shall see in the next
example. Here, and throughout the paper, we use $e(\cdot)$ to denote the standard character
$e(x) := \exp(2\pi i x)$.

Example 3 ($\mu$ is strongly orthogonal to linear phases). For any $\alpha \in \mathbb{R}/\mathbb{Z}$ and for any
$A > 0$, we have

$$\mathbb{E}_{n \in [N]} \mu(n) e(-\alpha n) \ll_A \log^{-A} N, \qquad (1.3)$$

uniformly in $\alpha \in \mathbb{R}/\mathbb{Z}$.
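
The uniformity in the frequency can be explored numerically by evaluating the average $\mathbb{E}_{n \in [N]} \mu(n) e(-\alpha n)$ on a grid of frequencies and taking the largest modulus. The sketch below is only an illustration under our own choices: the function names and the grid $\alpha = j/\text{grid\_size}$ are hypothetical, and a finite grid cannot certify a bound that is uniform over all of $\mathbb{R}/\mathbb{Z}$.

```python
import cmath


def mobius_sieve(limit):
    """Return a list mu with mu[n] equal to the Möbius function of n for 1 <= n <= limit."""
    mu = [1] * (limit + 1)
    mu[0] = 0  # index 0 is unused
    is_prime = [True] * (limit + 1)
    for p in range(2, limit + 1):
        if not is_prime[p]:
            continue
        for m in range(2 * p, limit + 1, p):
            is_prime[m] = False
        for m in range(p, limit + 1, p):
            mu[m] = -mu[m]
        for m in range(p * p, limit + 1, p * p):
            mu[m] = 0
    return mu


def linear_phase_average(mu, N, alpha):
    """E_{n in [N]} mu(n) e(-alpha n), where e(x) = exp(2 pi i x)."""
    total = 0j
    for n in range(1, N + 1):
        if mu[n]:
            total += mu[n] * cmath.exp(-2j * cmath.pi * alpha * n)
    return total / N


def sup_over_grid(mu, N, grid_size):
    """Largest modulus of the average over the frequencies j / grid_size."""
    return max(abs(linear_phase_average(mu, N, j / grid_size))
               for j in range(grid_size))


if __name__ == "__main__":
    mu = mobius_sieve(2000)
    for N in (250, 500, 1000, 2000):
        print(N, sup_over_grid(mu, N, 200))
```

At frequency $0$ the average reduces to the one in Example 1, which gives a quick sanity check on the implementation.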
This bound is due to Davenport [7] and can be deduced from (1.2) by an application of
Vinogradov's version of the Hardy-Littlewood major/minor arc decomposition of $\mathbb{R}/\mathbb{Z}$.
See, for example, [16, Theorem 13.10]. For pedagogical reasons, and because we need
this result for later sections, we give the derivation in §5. Davenport's result may be
used on its own to obtain a number of self-correlation estimates on $\mu$. For instance, by
combining (1.3) with elementary Fourier analysis (the circle method) we easily obtain
the estimates

$$\mathbb{E}_{x, d \in [N]} \mu(x) \mu(x + d) \mu(x + 2d) \ll_A \log^{-A} N \qquad (1.4)$$

and

$$\mathbb{E}_{x, h_1, h_2 \in [N]} \mu(x) \mu(x + h_1) \mu(x + h_2) \mu(x + h_1 + h_2) \ll_A \log^{-A} N. \qquad (1.5)$$

Similar expressions in which $\mu$ is replaced by $\Lambda$, the von Mangoldt function, may be
analysed using (1.3) as a key ingredient. The answers have a more complicated form
involving a main term which is a product of local factors or singular series. See [HI 
§13] and [T^ for different approaches to thi£|. 

A full discussion of results such as (1.4), (1.5) and the corresponding results for $\Lambda$ is
given in [T3]. For comparison with that paper, we remark that the two systems of linear 
forms in (1.4) and (1.5), namely $(x, x + d, x + 2d)$ and $(x, x + h_1, x + h_2, x + h_1 + h_2)$,
both have complexity equal to one. This notion of complexity 1 essentially marks the
limit of the classical Hardy-Littlewood circle method. The main goal of this paper is to 
provide some of the technical machinery needed to address the case of complexity 2. 

We can reformulate (1.3) in a manner which may appear strange at first, but is well
suited to generalisations, as we shall soon see. If $X$ is any metric space, define a Lipschitz
function on $X$ to be any function $f : X \to \mathbb{C}$ whose (inhomogeneous) Lipschitz norm

$$\|f\|_{\mathrm{Lip}} := \sup_{x \in X} |f(x)| + \sup_{x, y \in X : x \neq y} \frac{|f(x) - f(y)|}{d(x, y)}$$

is finite.

Example 4 ($\mu$ is strongly orthogonal to 1-step nilsequences). Suppose that $G$ is a connected,
simply-connected abelian Lie group (written multiplicatively) with a smooth
metric $d$, and that $\Gamma$ is a closed subgroup of $G$ which is cocompact. Then $G/\Gamma$ is called



While the von Mangoldt function $\Lambda$ is more directly related to the primes, the Möbius function
$\mu$ is somewhat easier to handle analytically, being bounded by 1 and not encountering the "local"
irregularities in small residue classes that $\Lambda$ faces; in particular, the "major arc" terms will have a
significantly simpler form. Also, the Vaughan identity for $\mu$ is slightly cleaner than that for $\Lambda$ (see
Lemma 4.1). Thus in this series of papers we have adopted a "Möbius first" philosophy, in which we
obtain estimates on the Möbius function $\mu$ using "hard" analytic tools, and then use "softer" techniques
to transfer the bounds on $\mu$ to the bounds on $\Lambda$.

The Lipschitz class is a convenient regularity class for us to use; it is smooth enough that functions in it
can be approximated uniformly and quantitatively by trigonometric series (see Lemma A.9), yet rough enough
that one can easily extend a function in this class from a small domain to a larger domain (see Lemma 
lA.Sp . Also, the Lipschitz class is meaningful in both discrete and continuous settings. Of course, the 
results of this paper also hold in smoother classes such as $C^\infty$, and qualitative versions of these results
(with decay factors such as $\log^{-A} N$ replaced by $o(1)$) hold for rougher classes such as the continuous
class $C^0$, or even piecewise continuous classes, by standard limiting arguments.
a 1-step nilmanifold; it is a torus. Let $F : G/\Gamma \to \mathbb{C}$ be a Lipschitz function, and let
$T_g : G/\Gamma \to G/\Gamma$ denote the action of $g$ on $G/\Gamma$. Then we have the estimate

$$\mathbb{E}_{n \in [N]} \mu(n) F(T_g^n x) \ll_{A, G/\Gamma} \|F\|_{\mathrm{Lip}} \log^{-A} N \qquad (1.6)$$
for all $N > 1$, uniformly in $g \in G$ and $x \in G/\Gamma$.

The sequence $n \mapsto F(T_g^n x)$ is called a 1-step nilsequence. If we specialize to the circle
nilflow case

then $G/\Gamma$ is isomorphic to the unit circle $\mathbb{R}/\mathbb{Z}$, and if we identify a real number $\alpha$ with
the corresponding group element, then $T_\alpha : \mathbb{R}/\mathbb{Z} \to \mathbb{R}/\mathbb{Z}$ is just the shift $x \mapsto x + \alpha \pmod 1$.
Using the standard character $e : \mathbb{R}/\mathbb{Z} \to \mathbb{C}$ as the Lipschitz function $F$, one then sees
that (1.3) is a special case of (1.6). In fact, the two examples are more-or-less equivalent,
as we shall see when (1.6) is established.

The main aim of this paper is to generalise (1.6) to cover 2-step nilsequences. In the
companion paper [14] to this paper, we shall show how such estimates can be used to
prove various "complexity 2" estimates for the Möbius and von Mangoldt functions.

Before stating our main result, we give the definition of s-step nilsequences in general, 
followed by some examples. 

Definition 1.2 (Nilmanifolds and nilsequences). Let $G$ be a connected, simply connected
Lie group. We define the central series $G_0 \supseteq G_1 \supseteq G_2 \supseteq \cdots$ by defining
$G_0 = G_1 = G$, and $G_{i+1} = [G, G_i]$ for $i \geq 1$, where the commutator group $[G, G_i]$ is the
group generated by $\{g h g^{-1} h^{-1} : g \in G, h \in G_i\}$. We say that $G$ is $s$-step nilpotent if
$G_{s+1} = 1$. Let $\Gamma \subset G$ be a discrete, cocompact subgroup. Then the quotient $G/\Gamma$ is
called an $s$-step nilmanifold. If $g \in G$ then $g$ acts on $G/\Gamma$ by left multiplication, $x \mapsto gx$.
By a (basic) $s$-step nilsequence, we mean a sequence of the form $(F(T_g^n \cdot x))_{n \in \mathbb{N}}$, where
$x \in G/\Gamma$ is a point, $F : G/\Gamma \to \mathbb{C}$ is a continuous function and $T_g : G/\Gamma \to G/\Gamma$ is
left multiplication by $g$. We say that the nilsequence is bounded if $F$ takes values in
$[-1, 1]$. We may (arbitrarily) endow $G/\Gamma$ with a smooth Riemannian metric $d_{G/\Gamma}$. If
the function $F$ is Lipschitz with respect to this metric, we shall refer to the nilsequence
$(F(T_g^n \cdot x))_{n \in \mathbb{N}}$ as Lipschitz.

Remark. In this paper we will usually suppress explicit mention of the metric $d_{G/\Gamma}$.
Whenever an estimate is said to depend on a nilmanifold $G/\Gamma$, it should be assumed
that it also depends on the choice of metric. See [14] for a more detailed discussion.

Clearly every 1-step nilsequence is a 2-step nilsequence. The next simplest examples of
nilsequences are quadratic phases.



QUADRATIC UNIFORMITY OF THE MOBIUS FUNCTION 5 

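Example 5 (The Heisenberg nilflow, I). Consider the example

$$G := \begin{pmatrix} 1 & \mathbb{R} & \mathbb{R} \\ 0 & 1 & \mathbb{R} \\ 0 & 0 & 1 \end{pmatrix}; \qquad \Gamma := \begin{pmatrix} 1 & \mathbb{Z} & \mathbb{Z} \\ 0 & 1 & \mathbb{Z} \\ 0 & 0 & 1 \end{pmatrix}.$$

Then $G/\Gamma$ is a 2-step nilmanifold. Apart from a set of zero measure, $G/\Gamma$ may be
identified with the fundamental domain

$$\mathcal{F} := \left\{ \begin{pmatrix} 1 & x & y \\ 0 & 1 & z \\ 0 & 0 & 1 \end{pmatrix} : -1/2 < x, y, z \leq 1/2 \right\}$$

using the easily-verified fact that

$$\begin{pmatrix} 1 & x & y \\ 0 & 1 & z \\ 0 & 0 & 1 \end{pmatrix} = \begin{pmatrix} 1 & \{x\} & \{y - x[z]\} \\ 0 & 1 & \{z\} \\ 0 & 0 & 1 \end{pmatrix} \pmod{\Gamma}.$$

Here, $\{x\}$ refers to the fractional part of $x$ lying in the interval $(-1/2, 1/2]$ and $[x] := x - \{x\}$. Writing

$$g := \begin{pmatrix} 1 & -\theta & -\theta \\ 0 & 1 & 2 \\ 0 & 0 & 1 \end{pmatrix}$$

where $\theta \in \mathbb{R}$, one may check that

$$g^n = \begin{pmatrix} 1 & \{-n\theta\} & \{n^2\theta\} \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \pmod{\Gamma}.$$

Thus we see how functions with "quadratic" behaviour arise from 2-step nilsequences.
The rather natural function $e(n^2\theta)$ does not quite arise as a Lipschitz nilsequence on
the $3 \times 3$ Heisenberg group, since the function

$$\begin{pmatrix} 1 & x & y \\ 0 & 1 & z \\ 0 & 0 & 1 \end{pmatrix} \mapsto e(y)$$

on $\mathcal{F}$ does not extend to a continuous function on $G/\Gamma$. The situation may be remedied
by splitting $e(n^2\theta)$ as the sum of (say) 10 functions $\chi(\{n\theta\}) e(n^2\theta)$ where $\chi$ is a Lipschitz
cutoff supported on an interval of width 1/5. Each of the 100 functions

$$\begin{pmatrix} 1 & x & y \\ 0 & 1 & z \\ 0 & 0 & 1 \end{pmatrix} \mapsto \chi(x) \chi'(z) e(y)$$

does extend to a Lipschitz function on $G/\Gamma$. By taking products one may realise $e(n^2\theta)$
as a Lipschitz nilsequence on the 2-step nilmanifold $(G/\Gamma)^{100}$.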
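
The reduction to the fundamental domain in this example can be checked mechanically with exact rational arithmetic. In the sketch below we represent the upper unitriangular matrix with entries $x, y, z$ above the diagonal by the triple `(x, y, z)`, and we take $g$ to have entries $x = -\theta$, $y = -\theta$, $z = 2$; this is our reading of the example, and the function names are ours. Under these assumptions the code verifies that $g^n$ reduces to the triple $(\{-n\theta\}, \{n^2\theta\}, 0)$ modulo $\Gamma$.

```python
from fractions import Fraction
from math import ceil


def frac(t):
    """Signed fractional part {t}, lying in the interval (-1/2, 1/2]."""
    return t - ceil(t - Fraction(1, 2))


def multiply(a, b):
    """Product of two upper unitriangular 3x3 matrices stored as (x, y, z)."""
    x1, y1, z1 = a
    x2, y2, z2 = b
    return (x1 + x2, y1 + y2 + x1 * z2, z1 + z2)


def reduce_mod_gamma(a):
    """Representative of the coset a * Gamma in the fundamental domain."""
    x, y, z = a
    nearest_z = z - frac(z)  # the integer [z]
    return (frac(x), frac(y - x * nearest_z), frac(z))


def power(g, n):
    """g^n by repeated multiplication, starting from the identity."""
    result = (Fraction(0), Fraction(0), Fraction(0))
    for _ in range(n):
        result = multiply(result, g)
    return result


if __name__ == "__main__":
    theta = Fraction(3, 17)  # hypothetical sample value
    g = (-theta, -theta, Fraction(2))
    for n in range(6):
        print(n, reduce_mod_gamma(power(g, n)))
```

Using `Fraction` rather than floating point keeps the comparison with $\{n^2\theta\}$ exact, so no tolerance is needed when testing equality.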



In view of the previous example and our general intent in this paper, it is natural to
ask for the estimate

$$\mathbb{E}_{n \in [N]} \mu(n) e(-\alpha n^2 - \beta n - \gamma) \ll_A \log^{-A} N, \qquad (1.7)$$

with an implied constant independent of $\alpha$, $\beta$ and $\gamma$. We will prove such an estimate
later in the paper. Like (1.3), this bound is a fairly standard application of Vinogradov's version
of the Hardy-Littlewood method, though somewhat more complicated due to the need
to estimate quadratic exponential sums rather than just linear exponential sums. The
proof of it has much in common with techniques pioneered by Hua [12] and Vinogradov
[26] in connection with the Goldbach-Waring problem. It should be thought of as a
warm-up for the main business of the paper.

For more detail on the Heisenberg nilflow, Appendix B may be consulted. One can also generate
quadratic phase sequences such as $e(n^2\theta)$ using the slightly simpler skew shift nilflow (see e.g. [13,
Example 12.3]), but we shall refrain from doing so here as the underlying Lie group is disconnected
and thus does not quite fall within the framework of Definition 1.2.
As we have already mentioned, we shall see later that orthogonality to linear phases is
more-or-less equivalent to orthogonality to 1-step nilsequences. However, orthogonality 
to quadratic phases is significantly weaker than orthogonality to 2-step nilsequences. 
This is because there are examples of 2-step nilsequences which do not look much like 
quadratic phases. 

Example 6 (The Heisenberg flow, II). We repeat the analysis of the previous example, 
but with a less restrictive choice of g. Take 



$$g := \begin{pmatrix} 1 & \alpha & \beta \\ 0 & 1 & \gamma \\ 0 & 0 & 1 \end{pmatrix}.$$



A simple induction confirms that 



^ / 1 X y\ 11 x+na j/+n/3+|n(n+l)o 
(? ■ ( 1 z ) = I 1 z+nj 
V 1 



1 



When reduced to lie in the fundamental domain $\mathcal{F}$, one can end up with functions
taking the form $[n\alpha] n\gamma$ (and related forms). These functions are known as generalised
quadratics, and they capture the spirit of 2-step nilsequences much more completely
than genuine quadratic functions do. By repeating the tricks mentioned in the previous
example one may actually approximate $e(-[n\sqrt{2}] n\sqrt{3})$ (say) outside of sets of arbitrarily
small density as a Lipschitz nilsequence on some product of several copies of the
Heisenberg example.



The previous two examples give some idea of what a 2-step nilsequence looks like.
Our main result in this paper is that the Möbius function is strongly asymptotically
orthogonal to all such functions. This estimate is the case $s = 2$ of the Möbius and
Nilsequences Conjecture MN($s$): see [14, §6] for further discussion.

Main Theorem (MN(2) conjecture). Suppose that $G/\Gamma$ is a 2-step nilmanifold, and
that $F : G/\Gamma \to \mathbb{C}$ is a Lipschitz function. Then for every $A > 0$ we have the estimate

$$\mathbb{E}_{n \in [N]} \mu(n) F(T_g^n x) \ll_{A, G/\Gamma} \|F\|_{\mathrm{Lip}} \log^{-A} N \qquad (1.8)$$
uniformly in $g \in G$ and $x \in G/\Gamma$.

Remark. We conjecture that MN($s$) holds for arbitrary $s$, that is to say there is an
analogue of the Main Theorem for $s$-step nilmanifolds for any $s \geq 1$. The fact that the
bound (1.8) is uniform in $x$ is unsurprising (since $G/\Gamma$ is compact), as is the uniformity
among all $F$ with fixed Lipschitz norm (thanks to the Arzelà-Ascoli theorem). The
uniformity in $g$ is less trivial, and is quite important for applications.

We shall prove the Main Theorem as a consequence of a similar result, Theorem 2.2
below, in which the notion of a 2-step nilsequence is replaced by a more technical type
of sequence (a 1-step nilsequence twisted by a locally quadratic phase) that is more
tractable for analysis. The proof of Theorem 2.2 is by far the most difficult portion of
the paper and will occupy most of the remaining sections. In comparison, the deduction of the Main Theorem
from Theorem 2.2 is more standard and is performed in §2 and Appendix B.



The estimate (1.7), as well as estimates for generalised quadratic phases such as

$$\mathbb{E}_{n \in [N]} \mu(n) e(-[n\sqrt{2}] n\sqrt{3}) = o(1),$$
are consequences of our main theorem.

Remark. The main result of this paper can then be combined with the Gowers Inverse
Theorem from [13] to obtain a number of new correlation estimates for the Möbius
function, such as

$$\mathbb{E}_{x, d \in [N]} \mu(x) \mu(x + d) \mu(x + 2d) \mu(x + 3d) = o_{N \to \infty}(1)$$

and

$$\mathbb{E}_{x, h_1, h_2, h_3 \in [N]} \mu(x) \mu(x + h_1) \mu(x + h_2) \mu(x + h_3) \mu(x + h_1 + h_2) \mu(x + h_1 + h_3) \mu(x + h_2 + h_3) \mu(x + h_1 + h_2 + h_3) = o_{N \to \infty}(1)$$

(compare with (1.4), (1.5)). It can also be used (with some additional effort) to establish
an asymptotic for expressions such as

$$\mathbb{E}_{x, d \in [N]} \Lambda(x) \Lambda(x + d) \Lambda(x + 2d) \Lambda(x + 3d)$$

as $N \to \infty$, thus enabling one to count the number of quadruples of primes $p_1 < p_2 <
p_3 < p_4 \leq N$ in arithmetic progression up to a fixed level $N$. We defer all of these
applications to the companion paper [14].



2. A TECHNICAL REDUCTION 

In this section we present a technical counterpart of the Main Theorem, namely Theorem
2.2 below, in which the 2-step nilsequence is replaced by a more analytically tractable
object, namely a 1-step nilsequence twisted by a locally quadratic phase. We then
discuss how this result implies the Main Theorem. The proof of Theorem 2.2 will then
occupy the rest of the paper (except for the Appendices). We first need some notation.

Definition 2.1 (Locally polynomial phases). Let $S \subset \mathbb{Z}$ be a set of integers, and let
$d \geq 0$. A phase function $\phi : S \to \mathbb{R}/\mathbb{Z}$ is said to be locally degree $d$ on $S$ if whenever
$n, h_1, \ldots, h_{d+1}$ are such that the $2^{d+1}$ quantities $n + \epsilon_1 h_1 + \cdots + \epsilon_{d+1} h_{d+1}$, $\epsilon_j \in \{0, 1\}$, lie
in the set $S$, we have

$$\sum_{\epsilon \in \{0, 1\}^{d+1}} (-1)^{\epsilon_1 + \cdots + \epsilon_{d+1}} \phi(n + \epsilon_1 h_1 + \cdots + \epsilon_{d+1} h_{d+1}) = 0. \qquad (2.1)$$

We refer to phases of local degree 1 as locally linear, phases of local degree 2 as locally
quadratic, and so forth.

Examples. Constant phases have local degree 0, while linear phases $\phi(n) := \alpha n$ for
$\alpha \in \mathbb{R}$ have local degree 1. If $\alpha, \beta, \gamma$ are real numbers, then the phase $\phi(n) := \alpha n^2 +
\beta n + \gamma \pmod 1$ is globally quadratic (i.e. quadratic on all of $\mathbb{Z}$). The phase $\phi(n) :=
\{\alpha n\} \{\beta n\} \gamma \pmod 1$ is not globally quadratic, but it is locally quadratic on the Bohr set
$S := \{n \in \mathbb{Z} : |\{\alpha n\}|, |\{\beta n\}| \leq 0.1\}$, which is a set of positive density in $\mathbb{Z}$. The phase
$\phi(n) := \{\alpha n\} \gamma \pmod 1$ is locally linear on the same set.
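
The local quadraticity of the bracket phase $\{\alpha n\}\{\beta n\}\gamma$ on the Bohr set can be tested numerically by evaluating the alternating sum in (2.1) with $d = 2$ over cubes all of whose eight vertices lie in $S$. The sketch below is an illustration under our own assumptions: the values $\alpha = \sqrt{2}$, $\beta = \sqrt{3}$, $\gamma = 0.7$, the search range and the step tolerance are hypothetical choices, and floating point arithmetic means the sum is only checked to vanish up to rounding error. Restricting to steps $h$ for which $\{\alpha h\}$ and $\{\beta h\}$ are both small is merely a device for finding cubes quickly; membership of every vertex in $S$ is still checked explicitly.

```python
from itertools import product
from math import sqrt

ALPHA, BETA, GAMMA = sqrt(2), sqrt(3), 0.7  # hypothetical sample parameters
RADIUS = 0.1  # the Bohr set radius used in the example


def frac(t):
    """Signed fractional part: t minus the nearest integer."""
    return t - round(t)


def in_bohr_set(n):
    return abs(frac(ALPHA * n)) <= RADIUS and abs(frac(BETA * n)) <= RADIUS


def phase(n):
    """A real-valued lift of the phase {alpha n}{beta n} gamma (mod 1)."""
    return frac(ALPHA * n) * frac(BETA * n) * GAMMA


def cube_points(n, steps):
    """All pairs (eps, n + eps . steps) with eps ranging over {0, 1}^len(steps)."""
    return [(eps, n + sum(e * h for e, h in zip(eps, steps)))
            for eps in product((0, 1), repeat=len(steps))]


def alternating_sum(n, steps):
    """The left-hand side of (2.1) for the cube with base n and the given steps."""
    return sum((-1) ** sum(eps) * phase(m) for eps, m in cube_points(n, steps))


def worst_defect(limit=1200, step_tolerance=0.03):
    """Count cubes inside the Bohr set and return the largest |sum mod 1| found."""
    small_steps = [h for h in range(1, limit + 1)
                   if abs(frac(ALPHA * h)) <= step_tolerance
                   and abs(frac(BETA * h)) <= step_tolerance]
    bases = [n for n in range(0, limit + 1) if in_bohr_set(n)]
    cubes, worst = 0, 0.0
    for n in bases:
        for steps in product(small_steps, repeat=3):
            if all(in_bohr_set(m) for _, m in cube_points(n, steps)):
                cubes += 1
                worst = max(worst, abs(frac(alternating_sum(n, steps))))
    return cubes, worst


if __name__ == "__main__":
    print(worst_defect())
```

The reason the sum vanishes is that on such a cube both $\{\alpha \cdot\}$ and $\{\beta \cdot\}$ are affine functions of $\epsilon$, so their product is a polynomial of degree 2 in $\epsilon$, and a third-order alternating sum annihilates it.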

Theorem 2.2 ($\mu$ is strongly orthogonal to local quadratics). Let $G/\Gamma$ be a 1-step
nilmanifold, let $F : G/\Gamma \to \mathbb{C}$ be a Lipschitz function, and let $g \in G$ and $x \in G/\Gamma$ be
arbitrary. Let $\phi : B_N \to \mathbb{R}/\mathbb{Z}$ be a phase which is locally quadratic on the Bohr set
$B_N := \{n \in [N] : F(T_g^n x) \neq 0\}$. Then we have

$$\mathbb{E}_{n \in [N]} \mu(n) F(T_g^n x) e(-\phi(n)) \ll_{G/\Gamma, A} \|F\|_{\mathrm{Lip}} \log^{-A} N.$$

The proof of Theorem 2.2 is rather lengthy. Let us assume it for now and deduce the
Main Theorem. The main proposition in achieving this deduction is

Proposition 2.3 (2-step nilsequences as averages of twisted 1-step nilsequences). Let
$G/\Gamma$ be a 2-step nilmanifold and let $0 < \epsilon < 1/2$. Let $F : G/\Gamma \to \mathbb{C}$ be a Lipschitz
function with $\|F\|_{\mathrm{Lip}} \leq 1$, and let $g \in G$ and $x \in G/\Gamma$ be arbitrary. Then there exists a
1-step nilmanifold $\tilde{G}/\tilde{\Gamma}$ depending only on $G/\Gamma$ and a decomposition

$$F(T_g^n x) = \sum_{i \in I} w_i F_i(T_{g_i}^n x_i) e(-\phi_i(n)) + O(\epsilon) \qquad (2.2)$$

where

• $I$ is a finite index set;

• For each $i \in I$ the $w_i$ are complex numbers with $\sum_{i \in I} |w_i| \ll \epsilon^{-O_{G/\Gamma}(1)}$;

• $F_i : \tilde{G}/\tilde{\Gamma} \to \mathbb{C}$ is Lipschitz with norm $O_{G/\Gamma}(1)$;

• $g_i \in \tilde{G}$;

• $x_i \in \tilde{G}/\tilde{\Gamma}$;

• $\phi_i : B_i \to \mathbb{R}/\mathbb{Z}$ is a phase function which is locally quadratic on the generalized
Bohr set $B_i := \{n \in [N] : F_i(T_{g_i}^n x_i) \neq 0\}$.

We have a proof of a generalisation of this proposition to $k$-step nilsequences (they
are averages of twisted $(k - 1)$-step nilsequences). This proceeds using some rather
algebraic considerations involving "Hall-Petresco parallelepiped groups" associated to
the nilmanifold $G/\Gamma$. These considerations are very similar to, but more complicated
than, the material in [14, Appendix E]. We anticipate presenting the proof of this result
in a future paper concerned with the generalisation of the Main Theorem to nilmanifolds
of arbitrary step.

In this paper we present a more computational approach involving so-called Mal'cev
bases [6, 18]. This approach is completely explicit when the group $G$ is a product of
Heisenberg groups $\begin{pmatrix} 1 & \mathbb{R} & \mathbb{R} \\ 0 & 1 & \mathbb{R} \\ 0 & 0 & 1 \end{pmatrix}$. The reader will find remarks in [14] explaining that, in the
theory of linear systems of complexity 2 (such as four-term APs) only examples of this
type need be considered.

The use of bases may seem overly explicit to some, but it should be noted that Mal'cev
bases are in fact required to prove certain foundational topological properties of nilmanifolds.
Those results are needed for the approach, just alluded to, that is taken in [14,
Appendix E].



This definition of a Bohr set is not quite identical to other Bohr sets in the literature, for instance
in [13], but it is very closely related; see the proof of Lemma [11. 41 
The proof of Proposition 2.3 may be found in Appendix B. Assuming it and Theorem
2.2, we can now derive the Main Theorem as follows.

Proof of the Main Theorem assuming Theorem 2.2 and Proposition 2.3. Let $G/\Gamma$, $F$, $A$
be as in the Main Theorem. By renormalising we may assume that $\|F\|_{\mathrm{Lip}} \leq 1$. We
apply Proposition 2.3 with $\epsilon := \log^{-A} N$ and obtain a decomposition (2.2). Taking
inner products with $\mu$, we obtain

$$\mathbb{E}_{n \in [N]} \mu(n) F(T_g^n x) \ll \sum_{i \in I} |w_i| \, |\mathbb{E}_{n \in [N]} \mu(n) F_i(T_{g_i}^n x_i) e(-\phi_i(n))| + \log^{-A} N.$$

Applying Theorem 2.2 we conclude that

$$\mathbb{E}_{n \in [N]} \mu(n) F(T_g^n x) \ll_{A', \tilde{G}/\tilde{\Gamma}} \sum_{i \in I} |w_i| \log^{-A'} N + \log^{-A} N$$

for any $A'$. But $\sum_{i \in I} |w_i| \ll (\log^{A} N)^{O_{G/\Gamma}(1)}$, and the claim follows by taking $A'$ suitably
large. □

Remark. Conversely it is also possible to deduce Theorem 2.2 from the Main Theorem
by obtaining a suitable converse to Proposition 2.3 (cf. the proof of [13, Theorem 12.8]),
but we will not do so here.

3. Orthogonality to periodic functions 

We now begin the proof of Theorem 2.2, which is the heart of this paper. (The other
major component of the paper is the proof of Proposition 2.3 in Appendix B. This can
mostly be read independently of the part of the paper concerned with Theorem 2.2,
though it will utilize the harmonic analysis tools collected in Appendix A.)

Our strategy in proving Theorem 2.2 shall be to establish the strong asymptotic
orthogonality of the Möbius function to increasingly large classes of sequences, starting
with very simple ones and then moving on to "higher degree" sequences. Let us begin
with some generalities on how one can go about proving that $\mu$ is orthogonal to some
function $F$. There are essentially two complementary methods for doing this. The first,
which will feature prominently in this section, is appropriate when $F$ is multiplicative,
for example $F = 1$ or $F = \chi$, where $\chi$ is some Dirichlet character to the modulus $q$.
Then one may relate $\mathbb{E}_{n \in [N]} \mu(n) F(n)$ via Perron's Formula to zeros of $L$-functions such
as $\zeta(s)$ and $L(s, \chi)$ in the critical strip, the orthogonality coming from the non-existence
of zeros close to $\Re s = 1$. Siegel's theorem, concerning a possible zero near $s = 1$ when
$\chi$ is real, is of particular importance. It implies the bound (1.2), which we recall now:

Proposition 3.1. For any $A > 0$ we have

$$\mathbb{E}_{n \in [N]} \mu(n) \overline{\chi(n)} \ll_A q^{1/2} \log^{-A} N \qquad (3.1)$$
for all Dirichlet characters $\chi$ to modulus $q$.

Remark. For the proof, see [16, Prop. 5.29]. As noted in [16, p. 124] there are difficulties
involved in applying the standard Perron's formula approach to $\mathbb{E}_{n \in [N]} \mu(n) \chi(n)$ directly,
and it is rather easier to first obtain bounds on $\mathbb{E}_{n \in [N]} \Lambda(n) \chi(n)$. Note that the bound
is only non-trivial when the period $q$ is very small (e.g. a power of $\log N$) compared to $N$. If
one assumed GRH then one could improve the logarithmic decay here to a polynomial
decay, which would of course lead to improvements in the other bounds in this paper.

As we will see later in this section, the need to consider zeros of $L$-functions also appears
when dealing with functions $F$ which are not quite multiplicative. For example, they
must play a role in the case $F(n) = e(an/q)$, since any Dirichlet character to modulus
$q$ is a linear combination of a few such functions $F$.

At the other end of the spectrum one has functions $F$ which are far from multiplicative,
such as $F(n) = e(n\sqrt{2})$. For these functions a completely different method, due originally
to Vinogradov, may be brought to bear. The sum $\mathbb{E}_{n \in [N]} \mu(n) F(n)$ is decomposed
into so-called Type I and Type II sums, which can be estimated without having to
understand the oscillation of $\mu$. Provided $F$ is not close to being multiplicative, those
sums can often be shown to be small by (effective) harmonic analysis methods. We will
discuss this method, in a modern and very neat incarnation due to Vaughan, in §4.

We now begin the proof of Theorem 2.2 by establishing the asymptotic orthogonality
of the Möbius function to periodic sequences, which can be viewed in some sense as
"0-step nilsequences", and which will be needed to handle the "major arc" case when
moving on to linear phases. More precisely, we show

Proposition 3.2 (Möbius is orthogonal to periodic sequences). Let $f : \mathbb{N} \to \mathbb{C}$ be a
sequence bounded in magnitude by 1 which is periodic of some period $q \geq 1$. Then we
have

$$\mathbb{E}_{n \in [N]} \mu(n) \overline{f(n)} \ll_A q \log^{-A} N$$
for all $A > 0$, where the implied constant is ineffective.

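Proof. We first establish the estimate under the additional assumption that $f(n)$ vanishes
whenever $(n, q) \neq 1$. Then $f$ can be viewed as a function on the multiplicative
group $(\mathbb{Z}/q\mathbb{Z})^\times$, and thus has a Fourier expansion

$$f(n) = \sum_{\chi} \hat{f}(\chi) \chi(n), \quad \text{where } \hat{f}(\chi) := \mathbb{E}_{n \in (\mathbb{Z}/q\mathbb{Z})^\times} f(n) \overline{\chi(n)},$$

with $\chi$ ranging over all the characters on $(\mathbb{Z}/q\mathbb{Z})^\times$. Applying Proposition 3.1 and the
triangle inequality, we conclude

$$\mathbb{E}_{n \in [N]} \mu(n) \overline{f(n)} \ll_A q^{1/2} \log^{-A} N \sum_{\chi} |\hat{f}(\chi)|.$$

But from Cauchy-Schwarz and Plancherel we have

$$\sum_{\chi} |\hat{f}(\chi)| \leq \phi(q)^{1/2} \Big( \sum_{\chi} |\hat{f}(\chi)|^2 \Big)^{1/2} = \phi(q)^{1/2} \big( \mathbb{E}_{n \in (\mathbb{Z}/q\mathbb{Z})^\times} |f(n)|^2 \big)^{1/2} = O(\phi(q)^{1/2}),$$

where $\phi(q) := |(\mathbb{Z}/q\mathbb{Z})^\times|$ is the Euler totient function. Since $\phi(q) \leq q$, the claim follows.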

Now we consider the general case, in which $(n, q)$ is not necessarily equal to 1 on the
support of $f$. Observe that if $\mu(n)$ is non-zero, then $n$ is square-free, and we can
split $n = dm$, where $d = (n, q)$ is square-free (so $\mu^2(d) = 1$) and $m$ is coprime to $q$.
Furthermore we have $\mu(n) = \mu(d) \mu(m)$. We thus obtain the decomposition

$$\mathbb{E}_{n \in [N]} \mu(n) f(n) = \frac{1}{N} \sum_{d | q;\ \mu^2(d) = 1} \mu(d) \sum_{1 \leq m \leq N/d} \mu(m) f(dm) 1_{(m, q) = 1}. \qquad (3.2)$$

The sequence $m \mapsto f(dm) 1_{(m, q) = 1}$ is periodic of period $q/d$ and vanishes whenever
$(m, q/d) \neq 1$, hence by the preceding arguments
J2 yu(m)/(rfm) <^ ^ log-^ N. 
Thus from fl3.2p we have 

dig 

concluding the proof of Proposition 3.2. □
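
The splitting $n = dm$ used in the general case is an exact identity, and it can be confirmed on small examples. The sketch below checks the un-normalised form $\sum_{n \leq N} \mu(n) f(n) = \sum_{d | q,\ \mu^2(d) = 1} \mu(d) \sum_{m \leq N/d,\ (m, q) = 1} \mu(m) f(dm)$, which is how we read (3.2) after multiplying through by $N$. The function names and the sample sequence $f$ are our own hypothetical choices; the identity itself does not use periodicity, which only enters when the inner sums are estimated.

```python
from math import gcd


def mobius_sieve(limit):
    """Return a list mu with mu[n] equal to the Möbius function of n for 1 <= n <= limit."""
    mu = [1] * (limit + 1)
    mu[0] = 0  # index 0 is unused
    is_prime = [True] * (limit + 1)
    for p in range(2, limit + 1):
        if not is_prime[p]:
            continue
        for m in range(2 * p, limit + 1, p):
            is_prime[m] = False
        for m in range(p, limit + 1, p):
            mu[m] = -mu[m]
        for m in range(p * p, limit + 1, p * p):
            mu[m] = 0
    return mu


def direct_sum(mu, f, N):
    """Sum of mu(n) f(n) over 1 <= n <= N."""
    return sum(mu[n] * f(n) for n in range(1, N + 1))


def decomposed_sum(mu, f, N, q):
    """The same sum, organised by d = gcd(n, q) as in the decomposition above."""
    total = 0
    for d in range(1, q + 1):
        if q % d or mu[d] == 0:
            continue  # keep only the squarefree divisors d of q
        inner = sum(mu[m] * f(d * m)
                    for m in range(1, N // d + 1) if gcd(m, q) == 1)
        total += mu[d] * inner
    return total


if __name__ == "__main__":
    N, q = 500, 12
    mu = mobius_sieve(max(N, q))
    f = lambda n: (n * n + 3 * n) % q - 5  # an integer-valued sequence of period q
    print(direct_sum(mu, f, N), decomposed_sum(mu, f, N, q))
```

An integer-valued $f$ is used so that the two sides can be compared for exact equality.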



4. VAUGHAN'S IDENTITY 



In this section we discuss Vinogradov's method for proving that the Möbius function
$\mu$ is orthogonal to a function $F : \mathbb{N} \to \mathbb{C}$. As we remarked in §3, this involves a
decomposition of $\mathbb{E}_{n \in [N]} \mu(n) F(n)$ into Type I and Type II sums. The first argument
of this type was due to Vinogradov (who worked with the von Mangoldt function $\Lambda$
instead of $\mu$). We will use a particularly simple identity due to Vaughan [23] to effect
our decomposition into Type I and II sums. See [151 Chapter 13] for a nice discussion 
of this and related identities. 

Let us begin with a few preliminary remarks on our strategy for dealing with Vinogradov's
method throughout the paper. The normal method for proving Davenport's
estimate (1.3) would be to divide all $\alpha \in \mathbb{R}/\mathbb{Z}$ into two classes: the major arcs, where
$\alpha \approx a/q$ for some reasonably small $q$, and the minor arcs, consisting of all other $\alpha$. If
$\alpha$ lies in a major arc then one would use Proposition 3.2 to estimate $\mathbb{E}_{n \in [N]} \mu(n) e(\alpha n)$.
If, by contrast, $\alpha$ lies in a minor arc then one would establish that Type I and II sums
involving $f(n) = e(\alpha n)$ are small (see below for an explanation of what this means).
Vaughan [211 Chapter 3] may be consulted for details. 

We will adopt what we call an "inverse" strategy. In §5 we will provide a proof of
Davenport's estimate. There we will assume that either a Type I or a Type II sum
involving $f(n) = e(\alpha n)$ is large, and then deduce that $\alpha$ lies in a major arc. The
distinction between our argument and the standard one may seem rather unimportant,
and indeed the two proofs are logically equivalent. However when it comes to dealing
with more complicated functions $f$, such as locally quadratic phases which arise from
the consideration of 2-step nilsequences, the inverse strategy is very helpful. There it is 
much less obvious what one should mean by a "major arc", and even once the definition 
is made it is not obvious how to handle it in the context of Type I and II sums. 
In light of Lemma A.7 it suffices to establish decay estimates for $\mathbb{E}_{N < n \leq 2N} \mu(n) f(n)$.
The next lemma gives Vaughan's decomposition of sums of this kind.

Lemma 4.1 (Vaughan's identity). Let $U, V, N$ be positive integers with $UV \leq N$, and
$f : \mathbb{N} \to \mathbb{C}$ be a sequence. Then we have

$$\mathbb{E}_{N < n \leq 2N} \mu(n) f(n) = -T_I + T_{II} \qquad (4.1)$$
where $T_I$ is the Type I expression

$$T_I := \frac{1}{N} \sum_{1 \leq d \leq UV} a_d \sum_{N/d < w \leq 2N/d} f(dw) \qquad (4.2)$$

in which

$$a_d := \sum_{bc = d :\ b \leq U,\ c \leq V} \mu(b) \mu(c)$$

and $T_{II}$ is the Type II expression

$$T_{II} := \frac{1}{N} \sum_{V < d \leq 2N/U} \ \sum_{\max(U, N/d) < w \leq 2N/d} \mu(w) b_d f(dw) \qquad (4.3)$$

in which

$$b_d := \sum_{c | d :\ c > V} \mu(c).$$

Remark. One of the key points in the analysis of Type I sums is that the precise form
of the coefficients $a_d$ is almost completely irrelevant: we will apply the Cauchy-Schwarz
inequality, and so only the mean square size of these coefficients will concern us. The
same is true in the analysis of Type II sums. In this case it is the coefficients $\mu(w)$ and
$b_d$ which get removed by the Cauchy-Schwarz inequality.

There is considerable flexibility in the choice of the parameters U and V. We will take 
U = V = 

in our applications. 

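Proof. We follow [16, §13.4 - 5]. Observe that for any positive integer $n$ we have

$$\mu(n) = \sum_{b, c :\ bc | n} \mu(b) \mu(c).$$

We split the range of the sum over $b, c$ into four ranges: (i) $b \leq U$, $c \leq V$; (ii) $b > U$,
$c \leq V$; (iii) $b \leq U$, $c > V$ and (iv) $b > U$, $c > V$. Denoting the associated sums
$\Sigma_1, \ldots, \Sigma_4$, it is easy to check that $\Sigma_2 = \Sigma_3 = -\Sigma_1$ whenever $n > \max(U, V)$. It follows that

$$\mu(n) = -\Sigma_1 + \Sigma_4 = -\sum_{\substack{b \leq U \\ c \leq V \\ bc | n}} \mu(b) \mu(c) + \sum_{\substack{b > U \\ c > V \\ bc | n}} \mu(b) \mu(c).$$

Multiplying by $f(n)$ and summing over $N < n \leq 2N$, we have Vaughan's identity:

$$\mathbb{E}_{N < n \leq 2N} \mu(n) f(n) = -\mathbb{E}_{N < n \leq 2N} \sum_{\substack{b \leq U \\ c \leq V \\ bc | n}} \mu(b) \mu(c) f(n) + \mathbb{E}_{N < n \leq 2N} \sum_{\substack{b > U \\ c > V \\ bc | n}} \mu(b) \mu(c) f(n) =: -T_I + T_{II}.$$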




It is an easy matter to confirm that $T_I$ may be written in the form (4.2), after making
the substitution $d = bc$ and $n = dw$. One may also check that $T_{II}$ may be written in
the form (4.3) after making the substitution $w = b$ and $n = dw$. □
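
The pointwise identity behind Vaughan's decomposition, $\mu(n) = -\Sigma_1 + \Sigma_4$, can be verified by brute force for small parameters. The sketch below is an illustration only: the function names and the sample values of $U$, $V$, $N$ are our own, and the check is restricted to $N < n \leq 2N$ with $UV \leq N$, which guarantees $n > \max(U, V)$ as the cancellation between the four ranges requires.

```python
def mobius_sieve(limit):
    """Return a list mu with mu[n] equal to the Möbius function of n for 1 <= n <= limit."""
    mu = [1] * (limit + 1)
    mu[0] = 0  # index 0 is unused
    is_prime = [True] * (limit + 1)
    for p in range(2, limit + 1):
        if not is_prime[p]:
            continue
        for m in range(2 * p, limit + 1, p):
            is_prime[m] = False
        for m in range(p, limit + 1, p):
            mu[m] = -mu[m]
        for m in range(p * p, limit + 1, p * p):
            mu[m] = 0
    return mu


def vaughan_pointwise(mu, n, U, V):
    """Return -Sigma_1 + Sigma_4, summing mu(b) mu(c) over pairs (b, c) with bc | n."""
    sigma_1 = 0  # range (i): b <= U and c <= V
    sigma_4 = 0  # range (iv): b > U and c > V
    for b in range(1, n + 1):
        if n % b or mu[b] == 0:
            continue
        rest = n // b  # bc | n exactly when c | n / b
        for c in range(1, rest + 1):
            if rest % c or mu[c] == 0:
                continue
            term = mu[b] * mu[c]
            if b <= U and c <= V:
                sigma_1 += term
            elif b > U and c > V:
                sigma_4 += term
    return -sigma_1 + sigma_4


if __name__ == "__main__":
    U, V, N = 4, 5, 20  # hypothetical parameters with UV <= N
    mu = mobius_sieve(2 * N)
    print(all(vaughan_pointwise(mu, n, U, V) == mu[n] for n in range(N + 1, 2 * N + 1)))
```

Multiplying the verified identity by $f(n)$ and averaging over $N < n \leq 2N$ gives the decomposition into Type I and Type II expressions.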



Vaughan's identity tells us that if $\mathbb{E}_{N < n \leq 2N} \mu(n) f(n)$ is large then either $T_I$ or $T_{II}$ is
large. The next proposition shows how this information is processed, by using the
Cauchy-Schwarz inequality to eliminate the parameters $a_d$, $b_d$ and $\mu(w)$, and leaving behind
estimates which only involve the explicit function $f$.

Proposition 4.2 (Inverse theorem for $\mathbb{E}_{N < n \leq 2N} \mu(n) f(n)$). Let $U, V, N$ be positive integers
with $UV \leq N$, and let $f : \mathbb{N} \to \mathbb{C}$ be a function with $\|f\|_\infty = O(1)$ such that

$$|\mathbb{E}_{N < n \leq 2N} \mu(n) f(n)| \geq \delta$$

for some $\delta > 0$. Then one of the following statements holds:

• (Type I sum is large) There exists an integer $1 \leq D \leq UV$ such that

\^N/d<n.^2N/df{dw)\ » 6l0g-'/'N (4.4) 

for log~^ integers d such that D < d ^ 2D. 

• (Type II sum is large) There exist integers $D, W$ with $V/2 \leq D \leq 4N/U$ and
$N/4 \leq DW \leq 4N$, such that

^2wf{dw)f{d'w)f{dw')f{d'w')\ » 5Hog-^'N. (4.5) 

Remark. The estimate (4.4) may be viewed as an assertion that $f$ behaves periodically,
while (4.5) is an assertion that $f$ behaves multiplicatively. The numerical exponents
could probably be improved slightly, but we will not need such refinements here
(especially since our bounds will eventually become ineffective anyway).



Proof. We may of course take $N$ to be large. Applying Lemma 4.1, we see that either
$|T_I| \ge \delta/2$ or $|T_{II}| \ge \delta/2$.

Suppose first that the Type I expression is large, that is to say $|T_I| \ge \delta/2$ where $T_I$ is
given by (4.2). Using the crude bound $|a_d| \le \tau(d)$, where $\tau(d) := \sum_{b | d} 1$ is the divisor
function, we have

\[ \sum_{1 \le d \le UV} \frac{\tau(d)}{d}\, |\mathbb{E}_{N/d < w \le 2N/d}\, f(dw)| \gg \delta. \]

By the Cauchy-Schwarz inequality this implies that

\[ \sum_{1 \le d \le UV} \frac{1}{d}\, |\mathbb{E}_{N/d < w \le 2N/d}\, f(dw)|^2 \gg \delta^2 \Big( \sum_{1 \le d \le UV} \frac{\tau(d)^2}{d} \Big)^{-1}. \]

Invoking the divisor moment estimate (C.1), it follows that

\[ \sum_{1 \le d \le UV} \frac{1}{d}\, |\mathbb{E}_{N/d < w \le 2N/d}\, f(dw)|^2 \gg \delta^2 \log^{-4} N. \]

Dividing the region $1 \le d \le UV$ into dyadic blocks $D < d \le 2D$ (allowing for some
slight overlap) and applying the pigeonhole principle we obtain

\[ \mathbb{E}_{D < d \le 2D}\, |\mathbb{E}_{N/d < w \le 2N/d}\, f(dw)|^2 \gg \delta^2 \log^{-5} N \]

for some $D$, $1 \le D \le UV$. Since the summand is bounded by $O(1)$, a simple averaging
argument confirms that $|\mathbb{E}_{N/d < w \le 2N/d}\, f(dw)| \gg \delta \log^{-5/2} N$ for at least $\gg \delta^2 D \log^{-5} N$
values of $d$, which is what we wanted to prove.

Now suppose instead that the Type II expression is large, that is $|T_{II}| \ge \delta/2$. Using the
evident bound $|b_d| \le \tau(d)$, we conclude

\[ \sum_{V < d \le 2N/U} \tau(d)\, \Big| \sum_{N/d < w \le 2N/d} 1_{w > U}\, \mu(w) f(dw) \Big| \gg N\delta. \]

Applying Cauchy-Schwarz and the divisor moment estimate (C.1) once again, we conclude
that

\[ \sum_{V < d \le 2N/U} d\, \Big| \sum_{N/d < w \le 2N/d} 1_{w > U}\, \mu(w) f(dw) \Big|^2 \gg N^2 \delta^2 \log^{-4} N. \]

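By dyadic decomposition, we thus can find integers $D, W$ with $V/2 \le D \le 4N/U$ and
$N/4 \le DW \le 4N$ such that

\[ \sum_{D < d \le 2D} \Big| \sum_{W < w \le 2W} 1_{I_d}(w)\, \mu(w) f(dw) \Big|^2 \gg \frac{N^2}{D}\, \delta^2 \log^{-5} N, \]

where $I_d$ is the discrete interval $\{w > U : N/d < w \le 2N/d\}$. Applying Lemma A.2 to
remove the cutoff $1_{I_d}(w)$, we obtain

\[ \sum_{D < d \le 2D} \Big| \sum_{W < w \le 2W} \mu(w) f(dw) e(\alpha w) \Big|^2 \gg \frac{N^2}{D}\, \delta^2 \log^{-7} N \]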

for some $\alpha \in \mathbb{R}/\mathbb{Z}$. Expanding the left-hand side as

\[ \sum_{W < w, w' \le 2W}\ \sum_{D < d \le 2D} b(w, w')\, f(dw)\overline{f(dw')}, \]

where we use $b()$ to denote a bounded function whose exact form we do not care
about (see Appendix A), the required inequality (4.5) follows from the Cauchy-Schwarz
inequality in the form of Lemma A.10. □



5. Orthogonality to linear phase functions



As a first application of Proposition 4.2, let us recall the standard proof of Davenport's
estimate (1.3). We do this partly for expository reasons, to illustrate the "inverse"
approach to dealing with Type I and II sums, and also because we will need (1.3) to
treat the "major arc" case of quadratic phases in later sections. As we shall see, the
linear case is particularly easy, because the exponential sums can be easily computed
(using (A.1)). Here and in the rest of the paper we will be using some standard tools
from harmonic analysis, together with the notations $\|x\|_{\mathbb{R}/\mathbb{Z}}$ and $\|x\|_{\mathbb{R}/\mathbb{Z},Q}$, which we
summarize in Appendix A.

We begin with a partial result, which is weaker than (1.3) in that it only resolves the
theorem for the "minor arc" values of $\alpha$, but has the advantage of being completely
effective, as it does not require any information on Siegel zeroes.

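Proposition 5.1 (Correlation with a linear phase implies major arc). Let $\alpha \in \mathbb{R}$, let
$A > 0$, and let $N$ be a large integer such that

\[ |\mathbb{E}_{N < n \le 2N}\, \mu(n) e(-\alpha n)| \ge \log^{-A} N. \tag{5.1} \]

Then there exists $D$, $1 \le D \ll N^{2/3}$, such that

\[ \#\Big\{1 \le d \le 2D : \|\alpha d\|_{\mathbb{R}/\mathbb{Z}} \ll \frac{D}{N} \log^{4A+14} N \Big\} \gg D \log^{-4A-14} N. \tag{5.2} \]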

Proof. We apply Proposition 4.2 with $U = V = N^{1/3}$ and conclude one of the following
statements:

• (Type I sum is large) There exists $D$, $1 \le D \le N^{2/3}$, such that

\[ |\mathbb{E}_{N/d < w \le 2N/d}\, e(\alpha dw)| \gg \log^{-A-5/2} N \]

for $\gg D \log^{-2A-5} N$ values of $D < d \le 2D$.

• (Type II sum is large) There exist integers $D, W$ with $N^{1/3} \ll D \ll N^{2/3}$ and
$N/8 \le DW \le 8N$ such that

\[ |\mathbb{E}_{D < d, d' \le 2D}\, \mathbb{E}_{W < w, w' \le 2W}\, e(\alpha dw - \alpha d'w - \alpha dw' + \alpha d'w')| \gg \log^{-4A-14} N. \]

Suppose first that the Type I sum is large. Applying (A.1) we conclude that there are
$\gg D \log^{-2A-5} N$ values of $d$, $D < d \le 2D$, for which

\[ \|\alpha d\|_{\mathbb{R}/\mathbb{Z}} \ll \frac{D}{N} \log^{A+5/2} N. \]

This implies (5.2) with some room to spare.

Now suppose instead that the Type II sum is large. By the pigeonhole principle we can
find $d', w'$ such that

\[ |\mathbb{E}_{D < d \le 2D}\, \mathbb{E}_{W < w \le 2W}\, e(\alpha dw - \alpha d'w - \alpha dw' + \alpha d'w')| \gg \log^{-4A-14} N \]

and hence by the triangle inequality

\[ \mathbb{E}_{D < d \le 2D}\, |\mathbb{E}_{W < w \le 2W}\, e(\alpha(d - d')w)| \gg \log^{-4A-14} N. \]

Applying (A.1) we obtain

\[ \mathbb{E}_{D < d \le 2D} \min\Big(1, \frac{D}{N \|\alpha(d - d')\|_{\mathbb{R}/\mathbb{Z}}}\Big) \gg \log^{-4A-14} N, \]

and thus after a simple averaging argument we establish

\[ \#\Big\{D < d \le 2D : \|\alpha d - \alpha d'\|_{\mathbb{R}/\mathbb{Z}} \ll \frac{D}{N} \log^{4A+14} N\Big\} \gg D \log^{-4A-14} N. \]

Substituting $\tilde d := d - d'$, we conclude

\[ \#\Big\{-2D \le \tilde d \le 2D : \|\alpha \tilde d\|_{\mathbb{R}/\mathbb{Z}} \ll \frac{D}{N} \log^{4A+14} N\Big\} \gg D \log^{-4A-14} N. \]

Since $D \gg N^{1/3}$, we can easily remove the degenerate contribution when $\tilde d = 0$. The
claim (5.2) then follows by symmetry. □
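
The step from a large average of a linear phase to a small value of $\|\alpha(d - d')\|_{\mathbb{R}/\mathbb{Z}}$ rests on a geometric series bound. The statement of (A.1) is not reproduced in this section, so the Python sketch below assumes it is the standard estimate $|\mathbb{E}_{W < w \le 2W}\, e(\theta w)| \le \min(1, 1/(2W\|\theta\|_{\mathbb{R}/\mathbb{Z}}))$; the function names are hypothetical and the code is only a numerical illustration.

```python
import cmath


def circle_norm(t):
    """Distance from the real number t to the nearest integer."""
    frac = t % 1
    return min(frac, 1 - frac)


def average_phase(theta, W):
    """Return E_{W < w <= 2W} e(theta * w), where e(x) = exp(2 pi i x)."""
    total = sum(cmath.exp(2j * cmath.pi * theta * w) for w in range(W + 1, 2 * W + 1))
    return total / W


def geometric_bound(theta, W):
    """Return min(1, 1 / (2 W ||theta||)), the assumed form of the bound (A.1)."""
    dist = circle_norm(theta)
    if dist == 0:
        return 1.0
    return min(1.0, 1.0 / (2 * W * dist))
```

Read contrapositively, an average of size at least $\eta$ forces $\|\theta\|_{\mathbb{R}/\mathbb{Z}} \le 1/(2W\eta)$, which is how the bound is used in the proof above with $\theta = \alpha(d - d')$.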

The next task is to understand exactly what the condition (5.2) implies. It is clear that
it is some sort of "major arc" condition, as it forces $\alpha$ to lie close to a rational number
with reasonably small denominator. A naive inspection of (5.2) would lead one to guess
that this denominator is of size $D$ or so; however it turns out that one can reduce the
size of the denominator substantially, to be a power of $\log N$. Indeed, we have

Corollary 5.2 (Correlation with a linear phase implies major arc, II). Let $\alpha \in \mathbb{R}$, let
$A > 0$ and let $N$ be a large integer such that (5.1) holds. Then

\[ \|\alpha\|_{\mathbb{R}/\mathbb{Z},\, 16 \log^{8(A+4)} N} \ll \frac{\log^{8(A+4)} N}{N}. \]

The implied constant is effective.



Proof. We apply Proposition 15.11 to obtain D, 1 ^ D ^ N"^/^, obeying (15.21) . li D ^ 

logS(^+4)iv then the claim follows directly from If instead D ^ log^(^+^) A^, 

we may apply Lemma Dth) with / = {1,...,2L'}, 5i < f log^^^+^^ A^, and 82 > 

log-''(^+'') A^ to obtain the claim. □ 

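When $\alpha$ is major arc, i.e. when $\|\alpha\|_{\mathbb{R}/\mathbb{Z},Q}$ is small, we may proceed using Proposition 3.2.

Proposition 5.3 (Major arc phases are orthogonal to Mobius). Let $N$ be a large integer,
let $\alpha$ be a real number, and let $Q, K \ge 1$ be such that $\|\alpha\|_{\mathbb{R}/\mathbb{Z},Q} \le K/N$. Then we have

\[ \mathbb{E}_{N < n \le 2N}\, \mu(n) e(-\alpha n) \ll_A Q^{1/4} K^{1/2} \log^{-A} N \]

for any $A > 0$ (the implied constant is ineffective).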



Proof. Let $1 \le M \le N$ be a parameter to be chosen later. Then by partitioning the
interval $\{N < n \le 2N\}$ into intervals of length $M$, plus a remainder, we conclude that

\[ |\mathbb{E}_{N < n \le 2N}\, \mu(n) e(-\alpha n)| \le \sup_{|I| = M;\ I \subset [N, 2N]} \Big| \frac{1}{M} \sum_{n \in I} \mu(n) e(-\alpha n) \Big| + O\Big(\frac{M}{N}\Big). \]

By hypothesis, we have integers $a$ and $1 \le q \le Q$ such that $|\alpha - \frac{a}{q}| \le \frac{K}{qN}$. We thus have

\[ e(-\alpha n) = e(-an/q)\, e(-(\alpha - a/q) n) = e(-an/q)\, e(-(\alpha - a/q) n_I) + O\Big(\frac{KM}{N}\Big) \]

for any $n, n_I \in I$. Discarding the constant phase $e(-(\alpha - a/q) n_I)$, we conclude

\[ |\mathbb{E}_{N < n \le 2N}\, \mu(n) e(-\alpha n)| \le \sup_{|I| = M;\ I \subset [N, 2N]} \Big| \frac{1}{M} \sum_{n \in I} \mu(n) e(-an/q) \Big| + O\Big(\frac{KM}{N}\Big). \]

Applying Proposition 3.2 (replacing $A$ by $2A$) we have

\[ \Big| \sum_{n \in I} \mu(n) e(-an/q) \Big| \ll_A q^{1/2} N \log^{-2A} N. \]

Combining these estimates and making the optimal choice $M = q^{1/4} K^{-1/2} N \log^{-A} N$,
we obtain the claim. □

Combining Corollary 5.2 with Proposition 5.3 (and selecting the parameters $A$
appropriately) we conclude the unconditional estimate

\[ |\mathbb{E}_{N < n \le 2N}\, \mu(n) e(-\alpha n)| \ll_A \log^{-A} N, \]

uniformly in $\alpha \in \mathbb{R}/\mathbb{Z}$ and for any $A > 0$. Here the implied constant is ineffective.
Davenport's estimate (11. 3p then follows from Lemma [A. 71 (with = observing that 
the additional linear phase created by that lemma can be easily absorbed. □ 



6. Orthogonality to linear objects 



Our aim in this section is to prove that the Mobius function $\mu$ is orthogonal to various
functions $f : \mathbb{Z} \to \mathbb{C}$ of "linear" type. We begin by proving (1.6), which asserts that
$\mu$ is orthogonal to 1-step nilsequences. Then, in Proposition 6.3, we confirm that $\mu$ is
orthogonal to a certain type of locally linear phase function. This proposition is needed
for our later analysis of 2-step nilsequences (indeed, it essentially forms the "major arc"
part of that analysis).

Proof of (1.6). Let us begin by recalling what it is we are trying to prove. We have an
abelian Lie group $G$ and a cocompact discrete subgroup $\Gamma \le G$. Let $F : G/\Gamma \to \mathbb{C}$ be
any Lipschitz function. Then we wish to show that

\[ \mathbb{E}_{n \in [N]}\, \mu(n) F(g^n x) \ll_{A, G/\Gamma} \|F\|_{\mathrm{Lip}} \log^{-A} N \tag{6.1} \]

uniformly in $g \in G$ and $x \in G/\Gamma$. Now $G/\Gamma$ is isomorphic to the product of a torus and
a finite abelian group, and hence to some subgroup of a torus $(\mathbb{R}/\mathbb{Z})^d$. By Lemma A.8
we may assume that $F$ is defined on all of this torus. Let $0 < \varepsilon < 1$ be arbitrary. By
renormalising, we may also assume that $\|F\|_{\mathrm{Lip}} = 1$. By Lemma A.9 we may write

J 

F{x) = J2 cArrij ■ x) + Od{e^/^) 
i=i 

(say), where Cj = 0(1) and J = Od{£~'^)- Writing g = (cti, . . . , a^), we have 

J 

Fig'^x) = J2 Cje{mj ■ x)e{n{aimf^ + ■■■ + a^mf )) + Od{e^/^). 
i=i 

Multiplying by $\mu$ and taking the expectation over $n \le N$, the contribution of each of
the $J$ terms here is $O_A(\log^{-A} N)$ for any $A > 0$, thanks to (1.3). We therefore have

Ene[N]Kn)F{g^x) <^A,d e"^ log"^ A^ + e^'^. 
Optimising this in $\varepsilon$ and recalling that $A > 0$ was arbitrary, we obtain the claim. □

Our other goal in this section is to establish, in Proposition 6.3, orthogonality of $\mu$ to
phase functions which are almost linear on Bohr sets.
Definition 6.1 (Bohr sets). Let $N \ge 1$. Let $G/\Gamma$ be a 1-step nilmanifold (i.e. a
compact abelian Lie group). Then $G/\Gamma$ can be embedded as a closed subgroup of a
finite-dimensional torus $(\mathbb{R}/\mathbb{Z})^d$, and we let $d_{G/\Gamma}(x, y) := \|xy^{-1}\|_{G/\Gamma}$ be the metric on
$G/\Gamma$ induced from such an embedding (chosen arbitrarily), where we give the torus the
metric induced by the norm (A.3). For any $g \in G$, we define the
"norm" $\|n\|_g = \|n\|_{g,N}$ for all $n \in \mathbb{Z}$ by the formula

\[ \|n\|_g := \|g^n\|_{G/\Gamma} + \Big|\frac{n}{N}\Big|, \]

and then define the Bohr sets $B_g(n_0, \rho) \subset \mathbb{Z}$ for any $n_0 \in \mathbb{Z}$ and $\rho > 0$ as

\[ B_g(n_0, \rho) := \{n \in \mathbb{Z} : \|n - n_0\|_g < \rho\}. \]

Thus we have $B_g(n_0, \rho) = n_0 + B_g(0, \rho)$.

Remarks. These Bohr sets are closely related to the sets $B_N$ appearing in Theorem
2.2 and also to more "traditional" Bohr sets in the literature; see the proof of Lemma
11.4 below. We observe the sub-homogeneity property $\|nm\|_g \le |n| \|m\|_g$ for all $n, m \in \mathbb{Z}$,
with equality $\|nm\|_g = |n| \|m\|_g$ holding whenever $|n| \|m\|_g < c_{G/\Gamma}$ for some constant
$c_{G/\Gamma} > 0$. We shall use these facts frequently in the sequel without further comment.

Some other easy properties of Bohr sets are contained in the following lemma.

Lemma 6.2 (Bohr set estimates). Let $N \ge 1$, let $G/\Gamma$ be a 1-step nilmanifold, and let
$g \in G$. Let $0 < \rho < 1/2$.

(a) (Lower bound) We have $|B_g(0, \rho)| \gg_{G/\Gamma} \rho^{\dim(G/\Gamma)+1} N$.

(b) (Doubling property) We have $|B_g(0, 2\rho)| \ll_{G/\Gamma} |B_g(0, \rho)|$.

(c) (Divisibility) For any integer $d \ge 1$ we have

\[ |\{n \in B_g(0, \rho) : d | n\}| \gg_{G/\Gamma} \frac{1}{d} |B_g(0, \rho)|. \]

Proof. To obtain (a), we cover $G/\Gamma$ by $O_{G/\Gamma}(\rho^{-\dim(G/\Gamma)})$ balls $B$ of radius $\rho/4$, and also
cover $\{1, \dots, N\}$ by intervals $I$ of length $\rho N/4$. By the pigeonhole principle we can
find an interval $I$ and a ball $B$ such that $S := \{n \in I : g^n \in B\}$ has cardinality
$\gg_{G/\Gamma} \rho^{\dim(G/\Gamma)+1} N$. The claim then follows from the triangle inequality. Indeed if $n, n_0 \in S$
then $|(n - n_0)/N| \le \rho/2$ and $\|g^{n - n_0}\|_{G/\Gamma} < \rho/2$, and thus $S - n_0 \subset B_g(0, \rho)$. It follows
that $|B_g(0, \rho)| \ge |S|$.

The proof of (b) is very similar. We cover the ball with centre $0$ and radius $2\rho$ in $G/\Gamma$
by $O_{G/\Gamma}(1)$ balls $B$ of radius $\rho/4$, and the interval $\{1, \dots, \rho N\}$ by $O(1)$ intervals $I$ of
length $\rho N/4$. By the pigeonhole principle, there is an interval $I$ and a ball $B$ such
that the set $S := \{n \in B_g(0, 2\rho) : n \in I,\ g^n \in B\}$ has cardinality $\gg_{G/\Gamma} |B_g(0, 2\rho)|$.
Note, however, that if $n, n_0 \in S$ then $|(n - n_0)/N| \le \rho/2$ and $\|g^{n - n_0}\|_{G/\Gamma} \le \rho/2$, and so
$S - n_0 \subset B_g(0, \rho)$. It follows that $|B_g(0, \rho)| \ge |S|$.

Finally, we establish (c). By the pigeonhole principle there is some residue class
$X_b := \{x \in \mathbb{Z} : x \equiv b \pmod d\}$ for which $|B_g(0, \rho/2) \cap X_b| \ge d^{-1} |B_g(0, \rho/2)|$. Note, however,
that if $n, n_0 \in B_g(0, \rho/2) \cap X_b$ then $d | (n - n_0)$ and $n - n_0 \in B_g(0, \rho)$. The result now
follows from (b). □
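
To make Definition 6.1 and the proof of Lemma 6.2(c) concrete, the following Python sketch computes Bohr sets in the simplest case. It assumes, purely for illustration, that $G/\Gamma$ is the one-dimensional torus $\mathbb{R}/\mathbb{Z}$ written additively, so that $g^n$ becomes $ng \bmod 1$; the function names are hypothetical, and exact rational arithmetic is used to avoid rounding at the boundary of the set.

```python
from fractions import Fraction


def circle_norm(t):
    """Distance from t to the nearest integer (t may be a Fraction)."""
    frac = t % 1
    return min(frac, 1 - frac)


def bohr_norm(n, g, N):
    """Return ||n||_g = ||n g||_{R/Z} + |n / N| for the torus R/Z (an assumption)."""
    return circle_norm(n * g) + Fraction(abs(n), N)


def bohr_set(g, N, n0, rho):
    """Return B_g(n0, rho) = {n : ||n - n0||_g < rho} as a Python set."""
    # ||n - n0||_g < rho forces |n - n0| < rho * N, so a finite scan suffices.
    radius = int(rho * N) + 1
    return {n for n in range(n0 - radius, n0 + radius + 1)
            if bohr_norm(n - n0, g, N) < rho}


def multiples_in_bohr_set(g, N, rho, d):
    """Return the elements of B_g(0, rho) divisible by d."""
    return {n for n in bohr_set(g, N, 0, rho) if n % d == 0}
```

In this model the pigeonhole argument for (c) gives the explicit inequality $d \cdot |\{n \in B_g(0, \rho) : d | n\}| \ge |B_g(0, \rho/2)|$, which can be compared directly with the output of `multiples_in_bohr_set`.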

As we have remarked, the next result will form the "major arc" part of our analysis of
2-step nilsequences. It may appear a little technical at this point, but has been designed
to cover everything we need in the later application.

Proposition 6.3 (Orthogonality to almost linear phases on Bohr sets). Let $N \in \mathbb{N}$ be
large, let $G/\Gamma$ be a 1-step nilmanifold, let $g \in G$, let $\rho \in (0, 1)$ and let $B_g(n_0, \rho)$ be some
Bohr set contained in $\{N + 1, \dots, 2N\}$. Let $\psi : \mathbb{Z} \to \mathbb{R}^+$ be a non-negative function
supported on $B_g(n_0, \rho)$ which obeys the Lipschitz estimate

\[ |\psi(n) - \psi(m)| \ll \|n - m\|_g \tag{6.2} \]

for all $n, m \in \mathbb{Z}$. Let $q \in [1, N/100]$ be an integer, let $\varepsilon \in (0, 1)$, and let $\phi : \mathbb{Z} \to \mathbb{R}/\mathbb{Z}$
be a phase obeying the approximate linearity estimate

\[ \|\phi(x + h_1 + h_2) - \phi(x + h_1) - \phi(x + h_2) + \phi(x)\|_{\mathbb{R}/\mathbb{Z}} \le \varepsilon \tag{6.3} \]

whenever $x, x + h_1, x + h_2, x + h_1 + h_2 \in B_g(n_0, 10\rho)$ and $q | h_1, h_2$. Then for any $\kappa \in (0, \rho]$
we have

\^N<n^2N^^{n)'^p{n)e{-(|){n))\ <^a,g/v /i^^^/^^'^gMog"^ + (e + fi:)E^<„^2Jv|V'HI 
for all A> Q {the constant is ineffective) . 

Proof. We can divide the interval {A^ + 1, . . . , 2A^} into q residue classes Xi, . . . , Xg 
modulo q. By the triangle inequality it suffices to show that 

\^N<n<^2NKf^) Ix, {n)'4){n)e{-(t){n))\ 

<AG/r K~Vlog"^X+ (£ + fi:)E^<n^27v|^(n)|lx.(^) 

for all s, 1 ^ s ^ q. 

Fix s. Without loss of generality we may assume that Xg fl Bg{no, p) is non-empty, thus 
we may choose Ug G XgCiBglno, p). We work in the group Z/pZ where p G [ION, 20N] is 
some prime, abusing notation by regarding functions on [N,2N] as functions on Z/pZ 
in an obvious way. Let / : Z/pZ — »• C be the function f{x) := tlj{x)e{—(j){x)), and 
similarly let p : Z/pZ — > C be the function }l{x) := p{x)li^^x^2N- Then our task is to 
show 

Kez/pzKx)lxAn)f{x) <A,G/r /t'V log"^ X + (e + fi:)Ejv<n^2iv|V'(n)|lx,(ri). (6.4) 

Now let F : Z/pZ C be the function defined by 

F{h) := lq\hlBg{o,^,)ih)e{(f){ns + h)). 

Observe that if $x \in X_s \cap B_g(n_0, \rho)$ and $h_1, h_2 \in B_g(0, \kappa)$ with $q | h_1, h_2$, then from three
applications of (6.3) we have (since $\kappa \le \rho$)

\begin{align*}
\phi(x + h_1) - \phi(x) - \phi(n_s + h_1) + \phi(n_s) &= O_{\mathbb{R}/\mathbb{Z}}(\varepsilon), \\
\phi(x + h_2) - \phi(x) - \phi(n_s + h_2) + \phi(n_s) &= O_{\mathbb{R}/\mathbb{Z}}(\varepsilon) \quad \text{and} \\
\phi(x + h_1 + h_2) - \phi(x + h_1) - \phi(x + h_2) + \phi(x) &= O_{\mathbb{R}/\mathbb{Z}}(\varepsilon),
\end{align*}

where we use $O_{\mathbb{R}/\mathbb{Z}}(\varepsilon)$ to denote a quantity whose $\|\cdot\|_{\mathbb{R}/\mathbb{Z}}$ norm is $O(\varepsilon)$. Summing these
three bounds yields

\[ \phi(x) = \phi(x + h_1 + h_2) - \phi(n_s + h_1) - \phi(n_s + h_2) + 2\phi(n_s) + O_{\mathbb{R}/\mathbb{Z}}(\varepsilon), \]

which of course implies that

\[ e(-\phi(x)) = e(-\phi(x + h_1 + h_2))\, e(\phi(n_s + h_1))\, e(\phi(n_s + h_2))\, e(-2\phi(n_s)) + O(\varepsilon). \]

From (6.2), the Lipschitz assumption on $\psi$, we know that $\psi(x + h_1 + h_2) = \psi(x) + O(\kappa)$
for $h_1, h_2 \in B_g(0, \kappa)$. Hence we conclude that

\[ f(x) = f(x + h_1 + h_2) F(h_1) F(h_2)\, e(-2\phi(n_s)) + O(\varepsilon + \kappa) \]

for all X G Z/pZ and hi, h2 G -8^(0, n) with q\hi, h2. Since |/(a;)| ^ il){x) pointwise, we 
may sum over and deduce that 

^xez/pm{x)lxs{x)f{x) = Eh^^h2£Bg(o,,,)^x&/pzfi'{x)f{x + hi + h2)F{hi)F{h2)e{2(t){ns)) 

q\hi,h2 

+ 0{{e + K)EN<n^2N\Hn)\lxAn)). 

To prove (16.41) . then, it suffices to show that 

^hi,h2<^Bg{o,Ky,q\hi,h2^xez/pi.fi'{x)fix + hi + h2)F{hi)F{h2) <A,G/r k,^^ \og~^ N . 
From Lemma 16.2( a) and (c) we have 

#{/iG5g(0,ft:) :g|M> V"", 

and so it is enough to prove that 

Eh^M,x&lp'LKx)f{x + hi + h2)F{hi)F{h2) <^A log-^ AT. 

To demonstrate this we use the Fourier transform^ on Z/pZ, noting in particular the 
identity 

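\[ \mathbb{E}_{h_1, h_2, x \in \mathbb{Z}/p\mathbb{Z}}\, \mu'(x) f(x + h_1 + h_2) F(h_1) F(h_2) = \sum_{\xi \in \mathbb{Z}/p\mathbb{Z}} \widehat{\mu'}(\xi)\, \hat f(-\xi)\, \hat F(\xi)^2. \]

Since $f$ and $F$ are bounded, we see from Plancherel's formula that $|\hat f(-\xi)| = O(1)$ and
$\sum_{\xi \in \mathbb{Z}/p\mathbb{Z}} |\hat F(\xi)|^2 = O(1)$. Also, from (1.3) we have $\widehat{\mu'}(\xi) \ll_A \log^{-A} N$ for any $\xi$. The
claim follows. □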
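
The Fourier identity used at the end of this proof holds for arbitrary functions on $\mathbb{Z}/p\mathbb{Z}$, with the normalisation $\hat g(\xi) := \mathbb{E}_{x \in \mathbb{Z}/p\mathbb{Z}}\, g(x) e(-x\xi/p)$ from the footnote. The Python sketch below checks it by brute force on a small modulus; the function names are hypothetical and the three input lists stand in for $\mu'$, $f$ and $F$.

```python
import cmath


def e(t):
    """The standard character e(t) = exp(2 pi i t)."""
    return cmath.exp(2j * cmath.pi * t)


def fourier(g, p):
    """Return the list of hat g(xi) = E_x g(x) e(-x xi / p) for xi = 0, ..., p-1."""
    return [sum(g[x] * e(-x * xi / p) for x in range(p)) / p for xi in range(p)]


def physical_side(mu, f, F, p):
    """Return E_{h1, h2, x} mu(x) f(x + h1 + h2) F(h1) F(h2) over Z/pZ."""
    total = 0
    for x in range(p):
        for h1 in range(p):
            for h2 in range(p):
                total += mu[x] * f[(x + h1 + h2) % p] * F[h1] * F[h2]
    return total / p ** 3


def fourier_side(mu, f, F, p):
    """Return sum_xi hat mu(xi) * hat f(-xi) * hat F(xi)^2."""
    mu_hat, f_hat, F_hat = fourier(mu, p), fourier(f, p), fourier(F, p)
    return sum(mu_hat[xi] * f_hat[(-xi) % p] * F_hat[xi] ** 2 for xi in range(p))
```

The two sides agree up to floating-point error for any inputs, which reflects the fact that the identity is purely algebraic; the number-theoretic input in the proof is the bound on $\widehat{\mu'}$ alone.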

Remark. What we have in effect done here is approximate $\psi(n) e(-\phi(n))$ by something
akin to a dual function coming from the Gowers $U^2$-norm. By the general theory of this
norm we know that any bounded function which is orthogonal to all linear exponentials
(cf. (1.3)) is orthogonal to all such dual functions. The Fourier argument at the end
of the proof of Proposition 6.3 is basically the standard proof of this fact. See [13] for
further discussion.

Remark. The results of this section may be used to show that $\mu$ is orthogonal to various
other types of function, which need not be Lipschitz or even continuous, but which
are still somehow "approximately linear" in $n$. Examples of such functions include the
bracket-linear phases $e(\beta_1 \lfloor \alpha_1 n \rfloor + \dots + \beta_d \lfloor \alpha_d n \rfloor)$. We omit the details.

^If $g : \mathbb{Z}/p\mathbb{Z} \to \mathbb{C}$ is a function, and if $\xi \in \mathbb{Z}/p\mathbb{Z}$, we write $\hat g(\xi) := \mathbb{E}_{x \in \mathbb{Z}/p\mathbb{Z}}\, g(x) e(-x\xi/p)$.
7. Orthogonality to quadratic phases



In this section our aim is to prove the estimate (1.7). Strictly speaking, this section is
unnecessary, since (1.7) does not represent the heart of the Main Theorem in the same
way that (1.3) forms the substance of (1.6). See the introduction for some remarks on
this point.

This section is included for two pedagogical reasons. First of all the argument does have 
some features in common with the (far more complicated) analysis of later sections, 
and thus introduces the main ideas of those sections in a simplified setting. Secondly, 
it represents a good opportunity to introduce some notation for inequalities which will 
be very helpful for the rest of the paper. 

The definition of asymptotic orthogonality involves establishing that $X \ll_A \log^{-A} N$,
for various quantities $X$ and for all $A > 0$, and it is convenient to have a notation
specific to this kind of situation. In each argument that follows, the value of $A$ will be
arbitrary, but fixed throughout the argument. When we write $X \lesssim Y$ or $Y \gtrsim X$, we
mean that

\[ |X| \le C_A Y \log^{C(A+1)} N \tag{7.1} \]

for some constant $C$ which does not depend on $A$, and some constant $C_A$ which can
depend (possibly in an ineffective manner) on $A$. The constants $C$ and $C_A$ can be
different in different instances of this notation. In all our arguments the exponent $C$
can be chosen effectively, but it may not be possible to give an explicit value of $C_A$ due
to the possibility of Siegel zeros.

In some cases, statements of the form $X \lesssim Y$ will appear as both hypotheses and
conclusions of a proposition. In such cases it is understood that the implied constants in
the conclusions are dependent on the implied constants in the hypotheses. Somewhat
more subtly, in the course of an argument we may divide into several cases using this
notation (e.g. we may divide into two cases $X \lesssim Y$ and $X \gtrsim Y$). Once again, the
implied constants in the conclusion of this argument will depend on the implied constants
used to create the division of cases. When necessary we shall draw attention to these
dependence-of-constants issues^.

^One can of course rewrite all the arguments in this paper replacing every appearance of $X \lesssim Y$ or
$Y \gtrsim X$ by suitably explicit long-hand forms (7.1), although some of the constants may be ineffective.
However we have found that this tended to clutter the estimates with distracting numerical constants,
and so we have chosen instead to suppress all of these constants.

Our argument here shall broadly follow that used to prove (1.3) in Section 5. We begin with
the analogue of Proposition 5.1.

Proposition 7.1 (Correlation with quadratic phase implies major arc). Let $\alpha, \beta, \gamma$ be
real numbers, $A > 0$, and let $N$ be a large integer such that

\[ |\mathbb{E}_{N < n \le 2N}\, \mu(n) e(-\alpha n^2 - \beta n - \gamma)| \ge \log^{-A} N. \tag{7.2} \]

Then there exists $D$, $1 \le D \ll N^{2/3}$, an integer $q \lesssim 1$ and a $\theta \in \mathbb{R}$ such that

\[ \#\Big\{d \in (D, 2D] : \|q\alpha d^2 - \theta\|_{\mathbb{R}/\mathbb{Z}} \lesssim \frac{D^2}{N^2}\Big\} \gtrsim D. \tag{7.3} \]

Furthermore if $D \ll N^{1/3}$ we can take $\theta = 0$.

Proof. We can discard the constant phase $e(-\gamma)$. As before, we apply Proposition 4.2
with $U = V = N^{1/3}$ and conclude one of the following statements:

• (Type I sum is large) There exists $D$, $1 \le D \ll N^{2/3}$, such that

\[ |\mathbb{E}_{N/d < w \le 2N/d}\, e(\alpha d^2 w^2 + \beta dw)| \gtrsim 1 \]

for $\gtrsim D$ values of $d \in (D, 2D]$.

• (Type II sum is large) There exist integers $D, W$ with $N^{1/3} \ll D \ll N^{2/3}$ and
$N/4 \le DW \le 4N$, such that

\[ |\mathbb{E}_{D < d, d' \le 2D}\, \mathbb{E}_{W < w, w' \le 2W}\, e(\phi(dw) - \phi(d'w) - \phi(dw') + \phi(d'w'))| \gtrsim 1 \]

where $\phi(n) := \alpha n^2 + \beta n$.

Suppose first that the Type I sum is large. Applying Lemma A.11 we can find an
integer $q \lesssim 1$ such that $\|q d^2 \alpha\|_{\mathbb{R}/\mathbb{Z}} \lesssim D^2/N^2$ for $\gtrsim D$ values of $D < d \le 2D$, which
implies (7.3) (with $\theta = 0$).

Now suppose instead that the Type II sum is large. By the pigeonhole principle, we
can find $d', w'$ such that

\[ |\mathbb{E}_{D < d \le 2D}\, \mathbb{E}_{W < w \le 2W}\, e(\phi(dw) - \phi(d'w) - \phi(dw') + \phi(d'w'))| \gtrsim 1 \]

and hence

\[ |\mathbb{E}_{W < w \le 2W}\, e(\phi(dw) - \phi(d'w) - \phi(dw') + \phi(d'w'))| \gtrsim 1 \]

for $\gtrsim D$ values of $d$. Now the phase $\phi(dw) - \phi(d'w) - \phi(dw') + \phi(d'w')$ is quadratic
in $w$ with a leading coefficient of $\alpha(d^2 - (d')^2)$. We may thus apply Lemma A.11 and
conclude that there exists $q \lesssim 1$ such that

\[ \|q\alpha(d^2 - (d')^2)\|_{\mathbb{R}/\mathbb{Z}} \lesssim \frac{D^2}{N^2}. \tag{7.4} \]

Pigeonholing in $q$, we conclude there exists a single value of $q$ such that (7.4) holds
for $\gtrsim D$ values of $d \in (D, 2D]$. Setting $\theta := q\alpha(d')^2$, the claim follows. □

By using Lemma A.4 we can now conclude the analogue of Corollary 5.2.

Proposition 7.2 (Correlation with quadratic phase implies major arc, II). Let $\alpha, \beta, \gamma$
be real numbers, $A > 0$, and let $N$ be a large integer such that (7.2) holds. Then we
have

\[ \|\alpha\|_{\mathbb{R}/\mathbb{Z},Q} \lesssim N^{-2} \]

for some $Q \lesssim 1$.
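Proof. We apply Proposition 7.1 to obtain $D$, $1 \le D \ll N^{2/3}$, and $q \lesssim 1$ obeying (7.3).
If^ $D \lesssim 1$ then certainly $D \ll N^{1/3}$, and so we may take $\theta = 0$. There then exists
$d \in (D, 2D]$ such that

\[ \|q\alpha d^2\|_{\mathbb{R}/\mathbb{Z}} \lesssim \frac{D^2}{N^2} \lesssim \frac{1}{N^2}, \]

and the claim follows on replacing $q$ by $qd^2$.

Now let us suppose that $D \gtrsim 1$. We will not be able to apply Lemma A.12 as it is
not sufficiently "amplified" for our use here. Instead, we use the triangle inequality and
(7.3) to obtain

\[ \#\Big\{d, d' \in (D, 2D] : \|q\alpha(d^2 - (d')^2)\|_{\mathbb{R}/\mathbb{Z}} \lesssim \frac{D^2}{N^2}\Big\} \gtrsim D^2. \]

The diagonal case $d = d'$ is negligible since $D \gtrsim 1$, i.e.

\[ \#\Big\{d, d' \in (D, 2D] : d \ne d',\ \|q\alpha(d^2 - (d')^2)\|_{\mathbb{R}/\mathbb{Z}} \lesssim \frac{D^2}{N^2}\Big\} \gtrsim D^2. \tag{7.5} \]

Writing $d^2 - (d')^2 = d_1 d_2$, where $d_1 := d - d'$ and $d_2 := d + d'$, we conclude

\[ \#\Big\{d_1, d_2 : 1 \le |d_1|, |d_2| \le 4D : \|q\alpha d_1 d_2\|_{\mathbb{R}/\mathbb{Z}} \lesssim \frac{D^2}{N^2}\Big\} \gtrsim D^2. \]

By reflection symmetry we may take $d_1, d_2$ to be positive. In particular, for $\gtrsim D$ values
of $d_1$ in $[1, 4D]$, we have

\[ \#\Big\{d_2 \in [1, 4D] : \|q\alpha d_1 d_2\|_{\mathbb{R}/\mathbb{Z}} \lesssim \frac{D^2}{N^2}\Big\} \gtrsim D. \]

Applying Lemma A.4 (ii) we thus conclude that for each such $d_1$, there exists $q_{d_1} \lesssim 1$
such that

\[ \|q\alpha d_1 q_{d_1}\|_{\mathbb{R}/\mathbb{Z}} \lesssim \frac{D}{N^2}. \]

Applying the pigeonhole principle, we can thus find $q' \lesssim 1$ such that

\[ \#\Big\{1 \le d_1 \le 4D : \|q\alpha d_1 q'\|_{\mathbb{R}/\mathbb{Z}} \lesssim \frac{D}{N^2}\Big\} \gtrsim D. \]

Applying Lemma A.4 (ii) again, we conclude that there exists $q'' \lesssim 1$ such that

\[ \|q\alpha q' q''\|_{\mathbb{R}/\mathbb{Z}} \lesssim \frac{1}{N^2}. \]

Since $qq'q'' \lesssim 1$, the claim follows. □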

On the other hand, we have the quadratic analogue of Proposition 5.3.

Proposition 7.3 (Major arc quadratic phases are orthogonal to Mobius). Let $N$ be a
large integer, let $\alpha, \beta, \gamma \in \mathbb{R}/\mathbb{Z}$, and let $Q, K \ge 1$ be such that $\|\alpha\|_{\mathbb{R}/\mathbb{Z},Q} \le K/N^2$. Then
we have

\[ \mathbb{E}_{N < n \le 2N}\, \mu(n) e(-\alpha n^2 - \beta n - \gamma) \ll_{A'} Q^{1/4} K^{1/2} \log^{-A'} N \]

for any $A' > 0$ (the implied constant is ineffective).



^This is an instance of the subtlety of the $\lesssim$ notation. By this we mean that $D \le C_A \log^{C(A+1)} N$,
where $C$ is chosen so that if $D > C_A \log^{C(A+1)} N$ then the later estimate (7.5) holds true.
Proof. Let $1 \le M \le N$ be a parameter to be chosen later. We can set $\gamma = 0$. Arguing
as in the proof of Proposition 5.3 we have

\[ |\mathbb{E}_{N < n \le 2N}\, \mu(n) e(-\alpha n^2 - \beta n)| \le \sup_{|I| = M;\ I \subset [N, 2N]} \Big| \frac{1}{M} \sum_{n \in I} \mu(n) e(-\alpha n^2 - \beta n) \Big| + O\Big(\frac{M}{N}\Big). \]

By hypothesis, we have an integer $a$ and $1 \le q \le Q$ such that $|\alpha - \frac{a}{q}| \le \frac{K}{qN^2}$. We thus
have

\begin{align*}
e(\alpha n^2) &= e(an^2/q)\, e((\alpha - a/q) n^2) \\
&= e(an^2/q)\, e(2(\alpha - a/q) n_I (n - n_I))\, e((\alpha - a/q) n_I^2) + O\Big(\frac{KM^2}{N^2}\Big) \\
&= e(an^2/q)\, e(2(\alpha - a/q) n_I n)\, b(\alpha, a/q, n_I) + O\Big(\frac{KM^2}{N^2}\Big)
\end{align*}

for any $n, n_I \in I$, where we use the $b()$ notation from Appendix A. Discarding the
constant phase $b(\alpha, a/q, n_I)$, and absorbing the linear phase $e(2(\alpha - a/q) n_I n)$ into the
$e(\beta n)$ factor we conclude

\[ \mathbb{E}_{N < n \le 2N}\, \mu(n) e(-\alpha n^2 - \beta n) \ll \sup_{|I| = M;\ I \subset [N, 2N];\ \beta' \in \mathbb{R}} \Big| \frac{1}{M} \sum_{n \in I} \mu(n) e(-an^2/q - \beta' n) \Big| + \frac{KM^2}{N^2} + \frac{M}{N}. \]

The function $e(an^2/q)$ is periodic of period $q$, and can thus be decomposed as a Fourier
series $e(an^2/q) = \sum_{b=0}^{q-1} c_b\, e(bn/q)$ where the coefficients $c_b$ are Gauss sums and can be
computed explicitly. From Plancherel's theorem and the Cauchy-Schwarz inequality we
have $\sum_{b=0}^{q-1} |c_b| = O(q^{1/2})$ (cf. the proof of Proposition 3.2). Applying (1.3) (with $A$
replaced by $2A'$) we conclude that

\[ \sum_{n \in I} \mu(n) e(-an^2/q - \beta' n) \ll_{A'} N q^{1/2} \log^{-2A'} N, \]

and hence

\[ \mathbb{E}_{N < n \le 2N}\, \mu(n) e(-\alpha n^2 - \beta n - \gamma) \ll_{A'} \frac{N}{M}\, q^{1/2} \log^{-2A'} N + \frac{KM^2}{N^2} + \frac{M}{N}. \]

If we set $M := K^{-1/2} q^{1/4} N \log^{-A'} N$ we obtain the claim. □
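
The Fourier expansion of the periodic function $e(an^2/q)$ used in this proof, together with the bound $\sum_b |c_b| = O(q^{1/2})$, can be illustrated numerically. The Python sketch below is an illustration under the assumption that the coefficients are normalised as $c_b = \frac{1}{q}\sum_{n \bmod q} e((an^2 - bn)/q)$; with this normalisation Plancherel gives $\sum_b |c_b|^2 = 1$ and Cauchy-Schwarz gives $\sum_b |c_b| \le q^{1/2}$. The function names are hypothetical.

```python
import cmath
import math


def gauss_coefficients(a, q):
    """Return [c_0, ..., c_{q-1}] with e(a n^2 / q) = sum_b c_b e(b n / q)."""
    coeffs = []
    for b in range(q):
        # Reduce the numerator modulo q first to keep the phases accurate.
        total = sum(cmath.exp(2j * cmath.pi * ((a * n * n - b * n) % q) / q)
                    for n in range(q))
        coeffs.append(total / q)
    return coeffs


def reconstruct(coeffs, n):
    """Evaluate the Fourier series sum_b c_b e(b n / q) at the integer n."""
    q = len(coeffs)
    return sum(c * cmath.exp(2j * cmath.pi * ((b * n) % q) / q)
               for b, c in enumerate(coeffs))


def l1_norm(coeffs):
    """Return sum_b |c_b|."""
    return sum(abs(c) for c in coeffs)


def l2_norm_squared(coeffs):
    """Return sum_b |c_b|^2, which equals 1 by Plancherel."""
    return sum(abs(c) ** 2 for c in coeffs)
```

Comparing `l1_norm(gauss_coefficients(a, q))` with `math.sqrt(q)` for a few moduli shows the square-root loss that is traded against the saving $\log^{-2A'} N$ in the argument above.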

Propositions 7.2 and 7.3 together imply (1.7), though the $\lesssim$ notation does take some
unravelling. Suppose for a contradiction that (7.2) holds. Then Proposition 7.2 implies
that $\|\alpha\|_{\mathbb{R}/\mathbb{Z},Q} \le K/N^2$, where we may take $K = Q = \log^{C(A+1)} N$ for some absolute
$C$. Proposition 7.3 now implies, taking $A' = C(A + 1)$, that

\[ \mathbb{E}_{N < n \le 2N}\, \mu(n) e(-\alpha n^2 - \beta n - \gamma) \ll_{A'} \log^{-A'/4} N. \]

We may clearly assume that $C > 3$, and so this does contradict our assumption that
(7.2) holds, at least if $N > N_0(A)$ is sufficiently large. To conclude the proof of (1.7),
one simply applies Lemma [A. 71 with (f = 1. □ 

Remark. It is straightforward to iterate the above argument, as is done in the standard
theory of Weyl exponential sums, to obtain a generalisation of (1.7) in which $\alpha n^2 + \beta n + \gamma$
is replaced by an arbitrary polynomial. We will, however, not pursue this generalisation
here.

8. Locally quadratic phase functions, I: a technical reduction 

We now begin the (onerous) task of proving Theorem 2.2. Let us begin by recalling the
statement:

Theorem 2.2 ($\mu$ is strongly orthogonal to local quadratics). Let $G/\Gamma$ be a 1-step
nilmanifold, let $F : G/\Gamma \to \mathbb{C}$ be a Lipschitz function, and let $g \in G$ and $x \in G/\Gamma$ be
arbitrary. Let $\phi : B_N \to \mathbb{R}/\mathbb{Z}$ be a phase which is locally quadratic on the Bohr set
$B_N := \{n \in [N] : F(T_g^n x) \ne 0\}$. Then we have

\[ \mathbb{E}_{n \in [N]}\, \mu(n) F(T_g^n x) e(-\phi(n)) \ll_{G/\Gamma, A} \|F\|_{\mathrm{Lip}} \log^{-A} N. \]

Our objective in this (rather technical) section is to reduce this to a similar result
which has certain important technical advantages. The most critical of these is that
$\phi$ can be extended to a function which is quadratic somewhat beyond the domain
$B_N = \mathrm{Supp}_n\, F(g^n x)$. This refined formulation reads as follows.

Proposition 8.1 (/i is strongly orthogonal to extendible local quadratics). Let g E G, 

X G G/T, uq G Z, and let po ^ (0? 10"^) be a small radius. Suppose that BgijiQ, lOOpo) "is 
contained in {n E 7j : N < n ^ 2A^}, and suppose that 0:Z— >]R/Zzsa function which 
is locally quadratic when restricted to i?g(no, lOOpo). Let ip : ^ be a function 
supported on Bg{no,pQ) which obeys the Lipschitz property 

\ip{n) — ?/'(m)| ^ \\n — mWg for all n, m G Z. (8-1) 

Then we have 

|E^<n^27V/u(n)^(n)e(-0(n)) I <A,G/r log"^ N. (8.2) 



Proof that Proposition \8.1\ implies Theorem \2.2. By renormalising, we may assume that 

ll^llLip ^ 1. 

The essential idea is that a "ball" (say $B_N$) can be covered by balls $B_g(n_0, \varepsilon)$ of a much
smaller radius. Most of these will have the property that $B_g(n_0, 100\varepsilon)$ is still contained
in $B_N$, and hence that $\phi$ is still quadratic on $B_g(n_0, 100\varepsilon)$.

We turn to the details. First of all, an application of Lemma [A. 71 implies that it suffices 
to establish the estimate 

EAr^„^2^p(n)(/.(n/iV)F(T;x)e(-0(ri)) <^,G/r log"^ N, (8.3) 

where : R — * M is the function 

(any similar function would work). The phase e{an) which featured in that lemma has 
been absorbed into the quadratic phase e(— 0(n)). 
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We now replace F by a "smooth-thresholded" function F, as constructed in Lemma 
IA.13I Let po G (0, 10~^) be a parameter to be chosen later, and set 6 := lO^po in 
Lemma [A. 131 This provides a Lipschitz function F : G /T ^ M. satisfying properties (i), 
(ii) and (iii) of that lemma. In particular from Lemma IA.13I (iii) we see that 

EN^n<i2Nfi{nMn/N)F{T;;x) = ¥.N^n^2Nli{n)^{n/N)F{T^x) + 0(po). (8.4) 
Now take a partition of unity 1 = Xa on G/F, where 

(i) Each Xa is supported on a ball of diameter at most po/2; 

(ii) Each Xa is bounded in magnitude by 1 and satisfies ||xa||Lip ^G/r Po ^; 

(iii) The number of Xa is O^Pq^^'^^^''). 

We leave the construction of such a partition to the reader: modelling G/F by a torus, 
one may be quite explicit. This partition of unity induces a decomposition 

a 

where F^ '■= Fxa- Note that since both F and Xa are bounded we have, using Lemma 
|Al3](i), that 

ll-^alkip ^ ll-^lkip + IIXallLip ^G/r Pq^ ■ (8.5) 

We may also effect a Lipschitz decomposition 

of if into 0(pq ^) Lipschitz functions with Lipschitz constant 0(pq ^), each supported 
on an interval of diameter po/2. Write ipa^pin) := Fa{TgX)ipfi{n/N). Noting that 

ip{n/N)F{T;x) = Y,^^,P^ 

it follows from fl8.4l) and the triangle inequality that 

W.N^n4^2Np{n)ip{n/N)F{T^x) (8.6) 
<G/r Po ^'^''^ ^ sup \^N<n<^2Np{n)i}a,f){n)e{-(j){n)) \ + po. (8.7) 

Suppose that n, n' G Supp('?/'Q,^/3). Then ipi3{n/N),ipp{n' /N) ^ 0, which means that 
n — n l/A^ < Po/2. Furthermore F^{T^x),Fa,{T^ x) ^ 0, meaning that " Hc/r ^ 
Po/2. It follows that ||n — ^ po) and so the support of ipa^p is contained in some 
ball Bg{no,po). 

We are, of course, going to apply Proposition 8.1. It is therefore necessary to confirm
that $\phi$ is defined on $B_g(n_0, 100\rho_0)$, and also to say something concerning the Lipschitz
constant of $\psi_{\alpha,\beta}$.

Starting with the first task, suppose that Suppli/ja^/s) C Bg{nQ, po) and that ipa,i3{ni) ^ 
for some ni G BgijiQ^po) (we may clearly ignore those a, (3 for which ip^^p = 0). Then 
ipp{n/N) 7^ and so, due to the choice of y?, we have 7/6 ^ rii/N ^ 11/6. It follows 
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that if n G Bg{no, lOOpo) then \n — ni\/N ^ lOlpo and thus, since po is so small, that 
N < n ^ 2N. We also have that Fa{g"'^x) ^ 0, which implies that F{g'^^x) ^ 0. Now if 
n e Bg{no, lOOpo) then dc/rig^'x, g"'^x) ^ lOlpo- It follows from Lemma lA. 131 and our 
choice of 6 that F{g"'x) ^ 0. We have shown that Bg{no, lOOpo) C 5^, and hence is 
indeed defined on the desired set. 



We now examine the Lipschitz constant of 1^0,(3, with the || ■ \\g metric on Z. We have, 
recalling flS.Sp . that 



\n — n" 



g 



\^l3{n) - < Po^ '" ' ^ Po^\\n - n'\ 

and 

iFo^ig^'x) - « Po i^?"""'llG/r ^ Po'\\n - n\\g. 

Since both Fa and ipp are bounded, the Lipschitz constant of ipa,f3 is Oc/riPo^)- 

We are now in a position to apply (a renormalised version of) Proposition I8.1[ We 
deduce that 

Eiv<n<27Vyu('^)V'a,/3(^)e(-0(n)) <A Pq^ \og'^ N 
uniformly in a, (3. Thus, from (18 ■71) . we see that 

Er,^rr^2Nfi{nMn/N)F{g^x) «G/r Po''''''^'^ log-^iV + p,. 

Setting po := log^"^^^*" A^, and recalling that A can be arbitrary, we do indeed conclude 
Theorem [221 □ 



It will be convenient later on (in the proof of Lemma ril.4p to add some further technical 
assumptions to the hypotheses of Proposition 18. 1[ We may assume that ^p is real. Next, 
recall that G/T was embedded isometrically in a torus (M/Z)"'; we may in fact simply 
replace G/T by that torus (M/Z)"' (using Lemma rA.8p since this does not affect anything. 
It will be convenient to work in Z/pZ where p is some prime between lOMA^ and 20MN. 
We can approximate the group element g by the nearest p*^ root of unity g in G/T, thus 
ll5'~^5'l|G/r ^ and g^ G P. Observe that the || ■ and || • \\g norms are comparable, 
thanks to the factor of |-^| in the definition of these norms. Thus we may, after making 
some trivial adjustments to the constants such as 100 in the proof of Proposition 18. H 
replace g by g, that is we may assume that g is a p^^ root of unity. 



9. Locally quadratic phase functions, II: explicit quadratic and quartic behaviour

We now begin the proof of Proposition 8.1. We are going to show that if (8.2) is false
then the phase $\phi$ is somehow "major arc". Ultimately we will relate it to the type of
phases in Proposition 6.3 which, in view of the main result of that proposition, will
lead to a contradiction. We have already seen several instances where a hypothesis
that the Mobius function $\mu$ correlates with some phase implies that the phase is "major
arc": Propositions 5.1, 5.2, 7.1 and 7.2 are examples of this. In those cases the phase
involved, being either linear or quadratic, was of a simple algebraic kind, but the phases


28 



BEN GREEN AND TERENCE TAG 



that interest us now are not so explicitly given. The two technical lemmas in this section 
show that these phases do, nevertheless, enjoy some algebraic structure. 

Suppose, for the remainder of the section, that $\phi : B_g(n_0, 100\rho_0) \to \mathbb{R}/\mathbb{Z}$ is a locally
quadratic phase. If $\|h_1\|_g, \|h_2\|_g \leq 30\rho_0$ then we define

\[ \phi''(h_1, h_2) := \phi(n_0 + h_1 + h_2) - \phi(n_0 + h_1) - \phi(n_0 + h_2) + \phi(n_0). \]

This expression is clearly symmetric in $h_1, h_2$. Since $\phi$ is locally quadratic on the Bohr
set $B_g(n_0, 100\rho_0)$, we conclude the "Taylor expansion"

\[ \phi''(h_1, h_2) = \phi(n + h_1 + h_2) - \phi(n + h_1) - \phi(n + h_2) + \phi(n) \tag{9.1} \]

whenever $n \in B_g(n_0, 40\rho_0)$. By telescoping the right-hand side, we conclude the local
bilinearity properties

\[ \phi''(h_1 + h_1', h_2) = \phi''(h_1, h_2) + \phi''(h_1', h_2); \quad \phi''(h_1, h_2 + h_2') = \phi''(h_1, h_2) + \phi''(h_1, h_2') \tag{9.2} \]

whenever $\|h_1\|_g, \|h_2\|_g, \|h_1'\|_g, \|h_2'\|_g \leq 15\rho_0$.

As another corollary of (9.1) we see that $\phi$ behaves like a genuine quadratic
function on certain short arithmetic progressions:

Corollary 9.1 (Explicit quadratic structure). If $n \in B_g(n_0, 20\rho_0)$, $L \in \mathbb{Z}^+$ and
$h \in B_g(0, 20\rho_0/L)$, then there exist $\alpha, \beta \in \mathbb{R}/\mathbb{Z}$ (depending on $n$ and $h$) such that

\[ \phi(n + hl) = \tfrac{1}{2}l(l-1)\phi''(h, h) + \alpha l + \beta \]

for all $l$, $1 \leq l \leq L$.

Proof. From (9.1) we obtain the recurrence

\[ \phi(n + h(l+2)) - 2\phi(n + h(l+1)) + \phi(n + hl) = \phi''(h, h) \]

for all $l$, $1 \leq l \leq L - 2$. The claim follows. □
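The passage from the recurrence to the closed form can be checked numerically. The following is only a sanity check under simplifying assumptions: it uses a genuine quadratic with rational coefficients (the coefficients `a`, `b`, `c` and the helper names are hypothetical, not from the paper) and exact arithmetic modulo 1, and takes $\beta = \phi(n)$ and $\alpha = \phi(n+h) - \phi(n)$.

```python
from fractions import Fraction

def frac(x):
    """Reduce a rational number modulo 1, i.e. view it as an element of R/Z."""
    return x - (x.numerator // x.denominator)

def second_derivative(phi, n0, h1, h2):
    """phi''(h1, h2) as the mixed second difference of phi based at n0."""
    return frac(phi(n0 + h1 + h2) - phi(n0 + h1) - phi(n0 + h2) + phi(n0))

# Hypothetical example phase: a genuine quadratic with rational coefficients.
a, b, c = Fraction(3, 7), Fraction(2, 5), Fraction(1, 3)
phi = lambda n: frac(a * n * n + b * n + c)

n, h, L = 11, 4, 20
d2 = second_derivative(phi, n, h, h)      # equals 2*a*h*h modulo 1
beta = phi(n)                             # value of the progression at l = 0
alpha = frac(phi(n + h) - phi(n))         # first difference at l = 0

def closed_form(l):
    """Right-hand side of Corollary 9.1: l(l-1)/2 * phi''(h,h) + alpha*l + beta."""
    return frac(Fraction(l * (l - 1), 2) * d2 + alpha * l + beta)

matches = all(closed_form(l) == phi(n + h * l) for l in range(1, L + 1))
```

For a merely locally quadratic phase the same computation is valid only while $n + hl$ stays inside the relevant Bohr set, which is what the hypothesis $h \in B_g(0, 20\rho_0/L)$ guarantees.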

This corollary is strong enough for us to understand the behaviour of the Type I sums
which will appear when, in subsequent sections, we analyse

\[ \mathbb{E}_{n \in [N]} \mu(n)\psi(n)e(-\phi(n)) \]

using Proposition 4.2. The corresponding Type II sums are more difficult. The basic
issue here is to understand the algebraic structure of the expression $\psi(dw)e(\phi(dw))$, as
a function of $d$ and $w$. Since $\phi$ is already quadratic, the phase $\phi(dw)$ here is quartic
(think of it as being like $d^2w^2$). We would like some analogue of Corollary 9.1 that
makes this quartic structure manifest; for instance we would like $\phi((d + sl)(w + tm))$
to exhibit some explicitly quartic behaviour in $l$ and $m$, under suitable hypotheses on
$d, s, l, w, t, m$ of course. This turns out to be a little tricky, because of the cross terms
$tdm$ and $slw$ present in the expression $(d + sl)(w + tm)$. By introducing suitably many
constraints (which will be available to us after later arguments) and taking enough
differences of the phase, we can eliminate these cross terms and obtain the sought-after
quartic structure.

Lemma 9.2 (Explicit quartic structure). Let $d, w; s, t$ and $L, M$ be integers such that

\[ LM\|st\|_g \leq \rho_0 \tag{9.3} \]

and let $P : \mathbb{Z} \times \mathbb{Z} \to \mathbb{Z}$ be the quadratic polynomial

\[ P(l, m) := (d + sl)(w + tm). \]

Suppose that the integers $l_0, l_1, l_2, m_0, m_1, m_2$ are such that $|l_j| \leq L$, $|m_j| \leq M$, and furthermore
that all sixteen of the values

\[ P(l_0 + i_1l_1 + i_2l_2,\ m_0 + j_1m_1 + j_2m_2), \qquad i_1, i_2, j_1, j_2 \in \{0, 1\}, \tag{9.4} \]

lie in $B_g(n_0, \rho_0)$. Then we have

\[ \sum_{i_1, i_2, j_1, j_2 \in \{0,1\}} (-1)^{i_1+i_2+j_1+j_2} \phi(P(l_0 + i_1l_1 + i_2l_2,\ m_0 + j_1m_1 + j_2m_2)) = 2l_1l_2m_1m_2\phi''(st, st). \tag{9.5} \]

Remark. This lemma is a generalisation of the observation that if $\phi(n) = an^2 + bn + c$ is
a quadratic, and one differentiates $\phi(P(l, m))$ twice in the $l$ variable and twice in the $m$
variable, one gets $2 \times \phi'' \times st \times st$, where $\phi'' = 2a$ is the double derivative of $\phi$. It is key
here that we have the sixteen constraints (9.4): this gives us sufficient instances where
(9.1) and (9.2) may be applied. Later arguments (involving many applications of the
Cauchy-Schwarz inequality) will put us in a situation where we have such a multiplicity
of constraints at our disposal.
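The observation in the remark can be tested directly in the model case of an honest integer quadratic, where no Bohr set constraints are needed. The sketch below uses hypothetical values for $a, b, c, d, w, s, t$ and the steps $l_j, m_j$; it forms the sixteen-term alternating sum of (9.5) with finite differences in place of derivatives and compares it with $2l_1l_2m_1m_2\phi''(st, st)$, where $\phi''(h_1, h_2) = 2ah_1h_2$.

```python
from itertools import product

def quartic_difference(phi, d, w, s, t, l, m):
    """Alternating sum of phi over the sixteen points of (9.4).

    l = (l0, l1, l2), m = (m0, m1, m2); P(x, y) = (d + s*x) * (w + t*y).
    """
    l0, l1, l2 = l
    m0, m1, m2 = m
    total = 0
    for i1, i2, j1, j2 in product((0, 1), repeat=4):
        x = l0 + i1 * l1 + i2 * l2
        y = m0 + j1 * m1 + j2 * m2
        sign = (-1) ** (i1 + i2 + j1 + j2)
        total += sign * phi((d + s * x) * (w + t * y))
    return total

# Hypothetical integer quadratic phi(n) = a n^2 + b n + c, so phi''(h1, h2) = 2 a h1 h2.
a, b, c = 5, -3, 7
phi = lambda n: a * n * n + b * n + c
d, w, s, t = 13, 17, 2, 3
l, m = (4, 1, -2), (-1, 3, 5)

lhs = quartic_difference(phi, d, w, s, t, l, m)
rhs = 2 * l[1] * l[2] * m[1] * m[2] * (2 * a * (s * t) * (s * t))
```

The linear and constant parts of $\phi$ drop out because $P$ is linear in each variable separately, so only the $aP^2$ term survives the four differences.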

Proof. By replacing $d, w$ by $d + l_0s$ and $w + m_0t$ we may assume that $l_0 = m_0 = 0$. Let
$l_1, l_2, m_1, m_2$ be as in the hypothesis of the lemma, that is to say $|l_j| \leq L$, $|m_j| \leq M$ and the
sixteen constraints (9.4) are satisfied. From the identities

\[ dw = P(0, 0) \]
\[ swl_1 = P(l_1, 0) - P(0, 0), \quad swl_2 = P(l_2, 0) - P(0, 0) \]
\[ tdm_1 = P(0, m_1) - P(0, 0), \quad tdm_2 = P(0, m_2) - P(0, 0), \]

we see that

\[ dw \in B_g(n_0, \rho_0) \quad \text{and} \quad swl_1, swl_2, tdm_1, tdm_2 \in B_g(0, 2\rho_0). \tag{9.6} \]

Now fix $i_1, i_2 \in \{0, 1\}$ and consider the sum

\[ \phi(P(i_1l_1 + i_2l_2, m_1 + m_2)) - \phi(P(i_1l_1 + i_2l_2, m_1)) - \phi(P(i_1l_1 + i_2l_2, m_2)) + \phi(P(i_1l_1 + i_2l_2, 0)). \tag{9.7} \]

We can rewrite this as

\[ \phi(n + h_1 + h_2) - \phi(n + h_1) - \phi(n + h_2) + \phi(n) \tag{9.8} \]

where $n := w(d + i_1sl_1 + i_2sl_2)$, $h_1 := (d + i_1sl_1 + i_2sl_2)tm_1$ and $h_2 := (d + i_1sl_1 + i_2sl_2)tm_2$.
From (9.3) and (9.6) we see that $n \in B_g(n_0, 5\rho_0)$, and that $h_1, h_2 \in B_g(0, 4\rho_0)$. Thus
all four of $n, n + h_1, n + h_2, n + h_1 + h_2$ lie in $B_g(n_0, 13\rho_0)$ and (9.1) is applicable, which
means we can rewrite (9.8) as

\[ \phi''((d + i_1sl_1 + i_2sl_2)tm_1,\ (d + i_1sl_1 + i_2sl_2)tm_2). \]

Applying (9.2) and (9.6), (9.3), we can expand this as

\[ (9.7) = X + i_1Y + i_2Z + 2i_1i_2l_1m_1l_2m_2\phi''(st, st) \]

where $X, Y, Z$ are quantities which depend on $\phi, d, s, t, l_1, m_1, l_2, m_2$ but are independent
of $i_1, i_2$. If one then takes an alternating sum of this identity over the four possible
choices of $i_1, i_2 \in \{0, 1\}$ to eliminate the $X, Y, Z$ terms, one obtains (9.5). □

10. Quadratic bias implies major arc

With the above preliminaries out of the way, we now begin the proof of Proposition 8.1
in earnest. In this section we shall establish the main step of this proof, namely that a
quadratic bias necessarily implies a "major arc" condition on $\phi$. We persist in our use
of the notations $X \lesssim Y$ and $X \gtrsim Y$, which were introduced earlier. Recall (cf. (7.1))
that $X \lesssim Y$ means that

\[ X \leq C_A Y \log^{C(A+1)} N \]

for some constant $C$ which does not depend on $A$. That constant is, from now on,
allowed to depend on the underlying 2-step nilmanifold $G/\Gamma$ (in actuality, it will depend
on the dimension of that nilmanifold). The constant $C_A$ is of course also allowed to
depend on $G/\Gamma$. Recall also from Appendix A the notation

\[ \|\alpha\|_{\mathbb{R}/\mathbb{Z}, Q} := \inf_{1 \leq q \leq Q} \|q\alpha\|_{\mathbb{R}/\mathbb{Z}}. \]

The main result of this section is as follows.

Proposition 10.1. Let the notation and assumptions be as in the previous section.
Suppose that

\[ |\mathbb{E}_{N < n \leq 2N} \mu(n)\psi(n)e(-\phi(n))| \geq \log^{-A} N. \tag{10.1} \]

Then there exist $X_0 \sim 1$, $D \leq 4N^{2/3}$ and $Q \sim 1$ with the following property: for any $X$
with $X_0 < X < N^{1/10}$, there exists a set $\mathcal{D} \subseteq [1, D]$, $|\mathcal{D}| \gtrsim D/X^{1/2}$, such that if $d \in \mathcal{D}$
and $w \in \mathbb{Z}$ satisfies $\|dw\|_g \leq 1/X$ then

\[ \|\phi''(dw, dw)\|_{\mathbb{R}/\mathbb{Z}, Q} \lesssim X^{-2}. \]

Remark. The conclusion here is an assertion that $\phi''(h, h)$ is major arc for many values
of $h$. We shall recast this conclusion into a more tractable form in the next section
(in particular it is necessary to show that as $w$ ranges over the values allowed in the
conclusion of the proposition, $h = dw$ takes on many different values).

Proof. Since $\psi$ is Lipschitz and supported on $B(n_0, \rho_0)$, we have $\|\psi\|_\infty \ll \rho_0$, and so we
conclude from (10.1) that

\[ \rho_0 \gtrsim 1. \tag{10.2} \]

In practice, this will make it fairly easy to verify hypotheses such as $\|h\|_g \leq \rho_0$ which
occur in the lemmas of the previous section.

We now apply Proposition 4.2 with $f(n) := \psi(n)e(\phi(n))$ and $U = V = N^{1/3}$ to conclude that
one of the following statements must be true:

• (Type I sum is large) There exists an integer $1 \leq D \leq N^{2/3}$ such that

\[ |\mathbb{E}_{N/d < w \leq 2N/d} \psi(dw)e(\phi(dw))| \gtrsim 1 \tag{10.3} \]

for $\gtrsim D$ integers $d$ such that $D < d \leq 2D$.

• (Type II sum is large) There exist integers $D, W$ with $\tfrac{1}{2}N^{1/3} \leq D \leq 4N^{2/3}$
and $N/4 \leq DW \leq 4N$, such that

\[ |\mathbb{E}_{D < d, d' \leq 2D} \mathbb{E}_{W < w, w' \leq 2W} \psi(dw)\psi(d'w)\psi(dw')\psi(d'w') \times e(\phi(dw) - \phi(d'w) - \phi(dw') + \phi(d'w'))| \gtrsim 1. \tag{10.4} \]

We can thus assume that either (10.3) or (10.4) holds, and see what this implies about
$\phi$. We handle the two cases separately.

Large Type I sums. Let us consider the (substantially simpler) Type I case when (10.3)
holds for many values of $d$. The bulk of the argument is contained inside the following
lemma.

Lemma 10.2 (Large Type I sum implies major arc). Let $d$, $D \leq d < 2D$, be such that
(10.3) holds, that is to say

\[ |\mathbb{E}_{N/d < w \leq 2N/d} \psi(dw)e(\phi(dw))| \gtrsim 1. \]

Assume that $N$ is large depending on $A$. Then there exist $Q \sim 1$ and $\varepsilon \sim 1$ such that

\[ \|\phi''(dt, dt)\|_{\mathbb{R}/\mathbb{Z}, Q} \lesssim L^{-2} \]

whenever $L \geq 1$ and $t \in \mathbb{Z}$ is such that $\|dt\|_g \leq \varepsilon/L$.

Proof. The idea is to analyze the quantity in (10.3) locally on short progressions of
common difference $t$ and length $L$. Since $\psi$ is supported on $(N, 2N]$, we have

\[ \Big|\sum_w \psi(dw)e(\phi(dw))\Big| \gtrsim \frac{N}{D}. \]

From the averaging identity

\[ \sum_w f(w) = \sum_w \mathbb{E}_{1 \leq l \leq L} f(w + tl), \]

valid for any compactly supported function $f : \mathbb{Z} \to \mathbb{C}$, we conclude

\[ \Big|\sum_w \mathbb{E}_{1 \leq l \leq L} \psi(dw + dtl)e(\phi(dw + dtl))\Big| \gtrsim \frac{N}{D}. \]
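The averaging identity used here is elementary: each shift $w \mapsto w + tl$ is a bijection of $\mathbb{Z}$, so every inner sum over $w$ equals the unshifted sum. A small numerical confirmation follows, using a hypothetical test function supported on $[100, 199]$ and hypothetical values of $t$ and $L$.

```python
def shifted_average_sum(f, support, t, L):
    """Sum over w of the average of f(w + t*l) for l = 1..L (f vanishes off `support`)."""
    lo, hi = support
    pad = abs(t) * L          # enlarge the w-range so that no shifted point is missed
    total = 0
    for w in range(lo - pad, hi + pad + 1):
        total += sum(f(w + t * l) for l in range(1, L + 1))
    return total / L          # dividing by L turns the inner sum into an average

# Hypothetical compactly supported test function on the integers.
values = {w: (w * w) % 11 - 5 for w in range(100, 200)}
f = lambda w: values.get(w, 0)

plain = sum(values.values())
averaged = shifted_average_sum(f, (100, 199), t=7, L=12)
```

In the proof the identity is applied with $f(w) = \psi(dw)e(\phi(dw))$, which is compactly supported because $\psi$ is.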

Since $\psi$ is supported on $(N, 2N]$ and

\[ |dtl| \leq LN|dt/N| \leq LN\|dt\|_g \leq \varepsilon N, \]

we see that in this sum we still have the constraint $|dw| = O(N)$, whence $w = O(N/D)$.
Thus by the pigeonhole principle we can find $w$ such that

\[ |\mathbb{E}_{1 \leq l \leq L} \psi(dw + dtl)e(\phi(dw + dtl))| \gtrsim 1. \]

By (8.1) we have

\[ \psi(dw + dtl) = \psi(dw) + O(l\|dt\|_g) = \psi(dw) + O(\varepsilon) \]

and hence (if $\varepsilon \sim 1$ is chosen suitably small)

\[ |\mathbb{E}_{1 \leq l \leq L} \psi(dw)e(\phi(dw + dtl))| \gtrsim 1. \]

Since $\psi(dw)$ is bounded and independent of $l$, it can be discarded and this becomes

\[ |\mathbb{E}_{1 \leq l \leq L} e(\phi(dw + dtl))| \gtrsim 1. \]

We apply Corollary 9.1 with $n := dw$ and $h := dt$. We may assume, in view of (10.2),
that $\varepsilon \leq 20\rho_0$, which means that $h \in B_g(0, 20\rho_0/L)$. Since $n \in \mathrm{Supp}(\psi) \subseteq B_g(n_0, \rho_0)$,
Corollary 9.1 does indeed apply and we may infer the existence of $\alpha, \beta \in \mathbb{R}/\mathbb{Z}$ such that

\[ |\mathbb{E}_{1 \leq l \leq L} e(\tfrac{1}{2}l(l-1)\phi''(dt, dt) + \alpha l + \beta)| \gtrsim 1. \]

Now if $L \geq \log^{C(A+1)} N$, for sufficiently large $C$, then Lemma A.11 applies and we may
indeed conclude that $\|\phi''(dt, dt)\|_{\mathbb{R}/\mathbb{Z}, Q} \lesssim L^{-2}$. If $L$ is not this large then (because so
much may be hidden inside the $\lesssim$ symbol) the conclusion is trivial anyway. □
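The dichotomy behind the last step can be seen numerically: a normalised quadratic exponential sum of the shape above stays large when the quadratic coefficient is a rational with small denominator, and becomes small when it is badly approximable. The following is an illustration only, with hypothetical choices of $\theta$ and $L$; it is not the quantitative statement of Lemma A.11.

```python
import cmath
import math

def quadratic_average(theta, alpha, beta, L):
    """|E_{1<=l<=L} e(l(l-1)/2 * theta + alpha*l + beta)|, with e(x) = exp(2 pi i x)."""
    total = 0j
    for l in range(1, L + 1):
        phase = (l * (l - 1) // 2) * theta + alpha * l + beta
        total += cmath.exp(2j * math.pi * (phase % 1.0))
    return abs(total) / L

L = 9000
major = quadratic_average(1 / 3, 0.0, 0.0, L)            # theta rational, denominator 3
minor = quadratic_average(math.sqrt(2), 0.0, 0.0, L)     # theta badly approximable
```

With $\theta = 1/3$ the summand is periodic with period 3 and the average has size about $\sqrt{3}/3$, while for $\theta = \sqrt{2}$ Weyl differencing shows the average is of order $L^{-1/2}$ up to logarithms.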

The deduction of Proposition 10.1 in the Type I case is almost immediate. Indeed from
the preceding lemma we see that for $\gtrsim D$ values of $d \in [D, 2D)$ we have

\[ \|\phi''(dt, dt)\|_{\mathbb{R}/\mathbb{Z}, Q} \lesssim L^{-2} \]

whenever $t \in \mathbb{Z}$ is such that $\|dt\|_g \leq \varepsilon/L$. Now simply let $\mathcal{D}$ be the set of such $d$, set
$L := X/\varepsilon$, and require that $X_0 \sim 1$ be large enough that $L \geq 1$ whenever $X > X_0$.

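Large Type II sums. We move on now to the much more complicated Type II case where
(10.4) holds. That is to say, we work under the assumption that

\[ |\mathbb{E}_{D < d, d' \leq 2D} \mathbb{E}_{W < w, w' \leq 2W} \psi(dw)\psi(d'w)\psi(dw')\psi(d'w') \times e(\phi(dw) - \phi(d'w) - \phi(dw') + \phi(d'w'))| \gtrsim 1 \]

where $\tfrac{1}{2}N^{1/3} \leq D \leq 4N^{2/3}$ and $N/4 \leq DW \leq 4N$.

Lemma 10.3 (Type II sum implies major arc). Let $D, W$ with $\tfrac{1}{2}N^{1/3} \leq D \leq 4N^{2/3}$ be such that
$\tfrac{1}{4}N \leq DW \leq 4N$ and (10.4) holds. Assume that $N$ is large depending on $A$. Then
there exist $Q \sim 1$ and $\varepsilon \sim 1$ with the property that

\[ \|\phi''(st, st)\|_{\mathbb{R}/\mathbb{Z}, Q} \lesssim 1/L^2M^2 \]

whenever $s, t \in \mathbb{Z}$ and $L, M \in \mathbb{Z}^+$ are such that $L|s| \leq \varepsilon D$, $M|t| \leq \varepsilon W$, $L, M \geq 1/\varepsilon$
and $\|st\|_g \leq \varepsilon^2/LM$.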

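Proof. It will be convenient to use the $\mathbf{b}(x_1, \ldots, x_k)$ notation introduced in Appendix
A. Thus for instance we can write (10.4) as

\[ |\mathbb{E}_{D < d, d' \leq 2D} \mathbb{E}_{W < w, w' \leq 2W} \psi(dw)e(\phi(dw))\mathbf{b}(d, w')\mathbf{b}(d', w)\mathbf{b}(d', w')| \gtrsim 1. \]

By the pigeonhole principle, we can thus find $d', w'$ such that

\[ |\mathbb{E}_{D < d \leq 2D} \mathbb{E}_{W < w \leq 2W} \psi(dw)e(\phi(dw))\mathbf{b}(d, w')\mathbf{b}(d', w)\mathbf{b}(d', w')| \gtrsim 1, \]

which upon relabeling the bounded functions $\mathbf{b}$ becomes simply

\[ |X| \gtrsim 1 \]

where $X$ is the quantity

\[ X := \mathbb{E}_{D < d \leq 2D} \mathbb{E}_{W < w \leq 2W} \psi(dw)e(\phi(dw))\mathbf{b}(d)\mathbf{b}(w). \]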
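Now we argue somewhat as in the proof of Lemma 10.2, averaging $d$ and $w$ over arithmetic
progressions. For any $1 \leq l \leq L$ and $1 \leq m \leq M$ we can make the change of
variables $d \to d + sl$, $w \to w + tm$ to obtain

\[ X = \mathbb{E}_{D - sl < d \leq 2D - sl} \mathbb{E}_{W - tm < w \leq 2W - tm} \psi((d + sl)(w + tm)) \times e(\phi((d + sl)(w + tm)))\mathbf{b}(d + sl)\mathbf{b}(w + tm). \]

From our assumption that $L|s| \leq \varepsilon D$ and $M|t| \leq \varepsilon W$ we infer that

\[ X = \mathbb{E}_{D < d \leq 2D} \mathbb{E}_{W < w \leq 2W} \psi((d + sl)(w + tm))e(\phi((d + sl)(w + tm))) \times \mathbf{b}(d + sl)\mathbf{b}(w + tm) + O(\varepsilon). \]

Averaging over $l$ and $m$ gives

\[ X = \mathbb{E}_{D < d \leq 2D} \mathbb{E}_{W < w \leq 2W} \mathbb{E}_{1 \leq l \leq L} \mathbb{E}_{1 \leq m \leq M} \psi((d + sl)(w + tm))e(\phi((d + sl)(w + tm))) \times \mathbf{b}(d + sl)\mathbf{b}(w + tm) + O(\varepsilon). \]

If $\varepsilon \sim 1$ is sufficiently small, the assumption that $|X| \gtrsim 1$ implies that

\[ |\mathbb{E}_{D < d \leq 2D} \mathbb{E}_{W < w \leq 2W} \mathbb{E}_{1 \leq l \leq L} \mathbb{E}_{1 \leq m \leq M} \psi((d + sl)(w + tm))e(\phi((d + sl)(w + tm))) \times \mathbf{b}(d + sl)\mathbf{b}(w + tm)| \gtrsim 1. \]

Hence by the pigeonhole principle there exist $d, w$ such that

\[ |\mathbb{E}_{1 \leq l \leq L} \mathbb{E}_{1 \leq m \leq M} \psi((d + sl)(w + tm))e(\phi((d + sl)(w + tm)))\mathbf{b}(d + sl)\mathbf{b}(w + tm)| \gtrsim 1. \]

Fix such $d, w$. By relabeling the $\mathbf{b}$'s, we can write $\mathbf{b}(d + sl)\mathbf{b}(w + tm)$ simply as $\mathbf{b}(l)\mathbf{b}(m)$.
We also set

\[ P(l, m) := (d + sl)(w + tm). \]

We have, then, that

\[ \Big|\sum_{l, m} f(l, m)\mathbf{b}(l)\mathbf{b}(m)\Big| \gtrsim LM \]

where

\[ f(l, m) := \psi(P(l, m))e(\phi(P(l, m)))1_{1 \leq l \leq L}1_{1 \leq m \leq M}. \tag{10.5} \]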
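Using Lemma A.10 to eliminate the $\mathbf{b}(l)\mathbf{b}(m)$ factors, we conclude

\[ \Big|\sum_{l, l', m, m'} f(l, m)\overline{f(l, m')}\,\overline{f(l', m)}f(l', m')\Big| \gtrsim L^2M^2. \]

We write $l = l_0$, $l' = l_0 + l_1$, $m = m_0$, $m' = m_0 + m_1$ to obtain

\[ \Big|\sum_{l_1, m_1} \sum_{l_0, m_0} F(l_0, m_0; l_1, m_1)\Big| \gtrsim L^2M^2 \]

where

\[ F(l_0, m_0; l_1, m_1) := f(l_0, m_0)\overline{f(l_0, m_0 + m_1)}\,\overline{f(l_0 + l_1, m_0)}f(l_0 + l_1, m_0 + m_1). \]

Applying Lemma A.10 again, this time in the $l_0$ and $m_0$ variables, we see that

\[ \Big|\sum_{l_1, m_1} \sum_{l_0, l_0', m_0, m_0'} F(l_0, m_0; l_1, m_1)\overline{F(l_0, m_0'; l_1, m_1)}\,\overline{F(l_0', m_0; l_1, m_1)}F(l_0', m_0'; l_1, m_1)\Big| \gtrsim L^3M^3. \tag{10.6} \]

Writing $l_0' = l_0 + l_2$, $m_0' = m_0 + m_2$, this becomes

\[ \Big|\sum_{l_0, l_1, l_2, m_0, m_1, m_2} G(l_0, l_1, l_2, m_0, m_1, m_2)\Big| \gtrsim L^3M^3 \]

where

\[ G(l_0, l_1, l_2, m_0, m_1, m_2) := F(l_0, m_0; l_1, m_1)\overline{F(l_0, m_0 + m_2; l_1, m_1)}\,\overline{F(l_0 + l_2, m_0; l_1, m_1)}F(l_0 + l_2, m_0 + m_2; l_1, m_1) \]
\[ = \prod_{(i_1, i_2, j_1, j_2) \in \{0,1\}^4} \mathcal{C}^{i_1+i_2+j_1+j_2} f(l_0 + i_1l_1 + i_2l_2,\ m_0 + j_1m_1 + j_2m_2) \]

and $\mathcal{C} : z \mapsto \overline{z}$ is the conjugation operator. Observe that the support of the sum in
(10.6) is still contained in the region $|l_j| \leq L$, $|m_j| \leq M$. By the pigeonhole principle,
we can find $l_0$ and $m_0$ such that

\[ \Big|\sum_{l_1, l_2, m_1, m_2} \prod_{(i_1, i_2, j_1, j_2) \in \{0,1\}^4} \mathcal{C}^{i_1+i_2+j_1+j_2} f(l_0 + i_1l_1 + i_2l_2,\ m_0 + j_1m_1 + j_2m_2)\Big| \gtrsim L^2M^2. \tag{10.7} \]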
Let us now expand the product using (10.5): this creates a very long product involving
sixteen phases (coming from the terms $e(\phi(P(l, m)))$ in the definition of $f$) and forty-eight
cutoffs (coming from the terms $\psi(P(l, m))1_{1 \leq l \leq L}1_{1 \leq m \leq M}$). The sixteen phases
$e(\phi(P(l, m)))$ combine to form a single phase

\[ e\Big(\sum_{i_1, i_2, j_1, j_2 \in \{0,1\}} (-1)^{i_1+i_2+j_1+j_2} \phi(P(l_0 + i_1l_1 + i_2l_2,\ m_0 + j_1m_1 + j_2m_2))\Big). \]

The presence of the forty-eight cutoffs is just what we need to apply Lemma 9.2, which
allows us to write the phase in (10.7) as

\[ e(2l_1l_2m_1m_2\phi''(st, st)). \]

Note that the condition (9.3) required by that lemma is a consequence of the condition
$\|st\|_g \leq \varepsilon^2/LM$ we are working under here, provided that $\varepsilon$ is chosen sufficiently small;
indeed recall from (10.2) that $\rho_0 \gtrsim 1$.

The forty-eight cutoffs have now served their purpose of explicitly quartilinearising
the phase, and we shall now set about obliterating them with further applications of
the Cauchy-Schwarz inequality. To do this, we observe by inspection that forty-seven
of these cutoffs depend on at most three of the variables $l_1, l_2, m_1, m_2$, with the lone
exception being $\psi(P(l_0 + l_1 + l_2, m_0 + m_1 + m_2))$. Also, let us recall once more that the
cutoffs restrict $l_1, l_2$ to have magnitude at most $L$, and $m_1, m_2$ to have magnitude at
most $M$. We thus have

\[ \Big|\sum_{\substack{|l_1|, |l_2| \leq L \\ |m_1|, |m_2| \leq M}} e(2l_1l_2m_1m_2\phi''(st, st))\psi(P(l_0 + l_1 + l_2, m_0 + m_1 + m_2)) \times \mathbf{b}(l_2, m_1, m_2)\mathbf{b}(l_1, m_1, m_2)\mathbf{b}(l_1, l_2, m_2)\mathbf{b}(l_1, l_2, m_1)\Big| \gtrsim L^2M^2. \tag{10.8} \]

We would like to eliminate all the $\mathbf{b}()$ factors using Lemma A.10, but we need to deal
with the exceptional cutoff $\psi(P(l_0 + l_1 + l_2, m_0 + m_1 + m_2))$ first. First observe that
if $\psi$ were a multiplicative function then the quadratic nature of $P$ would ensure that
$\psi(P(l_0 + l_1 + l_2, m_0 + m_1 + m_2))$ would factor into the product of expressions, each of which
only depends on at most three (in fact, at most two) of the $l_1, l_2, m_1, m_2$. Of course, $\psi$ is
not multiphcative, but thanks to flS.ll] we can write ip{n) = '^{g"', n/N) for N < n ^ 2N, 
where : G/F x (M/Z) ^ M is Lipschitz on the orbit {(gf", n/N) : N < n ^ 2N} and 
hence, by Lemma [A.81 is the restriction of a Lipschitz function on all of G/T x (M/Z). 
Let 5 ^ 1 be a parameter to be chosen later. Using Lemma IA.91 we can approximate 
uniformly to accuracy 0{6) on {N,2N] by a linear combination of at most 0{6~'") 
characters on G/T x (M/Z), each of which has the form {x,6) 1— > x{^)^{k(^) where 
X G {G/ry and A; G Z. The coefficients in this linear combination are all 0(1). Thus 
we can estimate the left-hand side of (110.81) by 

0{5'^ sup I J2 e(2/i/2mim20"(st,st))x(^^^'"+''+'''"'°+'"^+"'^)x 

X e{kP{lQ + I1 + k^rrio + rrii + m2))b(/2, mi, m2)b(/i, mi, m2)b(/i, l2,m2)h{li, h^rrii)]) 
+ 0{5L^M'^). 

Choosing 5 ~ 1 suitably small, we thus conclude that there exist x ^i-nd k such that 
the inner sum is ^ d^L'^M'^ ^ L'^M'^. By the quadratic nature of P we may absorb 
the terms ^[gPi^o+h+i2,mo+vn.^+m2)^^ and e{kP{lo + h + rrio + rrii + 7x12)) into the four 
unspecified bounded functions b(), thereby obtaining 

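\[ \Big|\sum_{\substack{|l_1|, |l_2| \leq L \\ |m_1|, |m_2| \leq M}} e(2l_1l_2m_1m_2\phi''(st, st))\mathbf{b}(l_2, m_1, m_2)\mathbf{b}(l_1, m_1, m_2)\mathbf{b}(l_1, l_2, m_2)\mathbf{b}(l_1, l_2, m_1)\Big| \gtrsim L^2M^2. \]

Applying Lemma A.10 to eliminate the $\mathbf{b}()$ factors, we deduce

\[ \Big|\sum_{\substack{|l_1|, |l_1'|, |l_2|, |l_2'| \leq L \\ |m_1|, |m_1'|, |m_2|, |m_2'| \leq M}} e(2(l_1 - l_1')(l_2 - l_2')(m_1 - m_1')(m_2 - m_2')\phi''(st, st))\Big| \gtrsim L^4M^4. \]

By the pigeonhole principle, we can thus find $l_1', l_2' = O(L)$ and $m_1', m_2' = O(M)$ such
that

\[ \Big|\sum_{\substack{|l_1|, |l_2| \leq L \\ |m_1|, |m_2| \leq M}} e(2(l_1 - l_1')(l_2 - l_2')(m_1 - m_1')(m_2 - m_2')\phi''(st, st))\Big| \gtrsim L^2M^2. \]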

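Summing in $m_2$ using (A.1), we obtain

\[ \sum_{\substack{|l_1|, |l_2| \leq L \\ |m_1| \leq M}} \min\Big(M, \frac{1}{\|2(l_1 - l_1')(l_2 - l_2')(m_1 - m_1')\phi''(st, st)\|_{\mathbb{R}/\mathbb{Z}}}\Big) \gtrsim L^2M^2. \]

Shifting $l_1, l_2, m_1$ by $l_1', l_2', m_1'$ respectively, and doubling $m_1$ to absorb the factor of
two this creates, we thus have

\[ \sum_{\substack{|l_1|, |l_2| \leq 2L \\ |m_1| \leq 4M}} \min\Big(1, \frac{1}{M\|l_1l_2m_1\phi''(st, st)\|_{\mathbb{R}/\mathbb{Z}}}\Big) \gtrsim L^2M. \]

It follows that

\[ \min\Big(1, \frac{1}{M\|l_1l_2m_1\phi''(st, st)\|_{\mathbb{R}/\mathbb{Z}}}\Big) \gtrsim 1 \]

for $\gtrsim L^2M$ triples $(l_1, l_2, m_1)$, which means that

\[ \|l_1l_2m_1\phi''(st, st)\|_{\mathbb{R}/\mathbb{Z}} \lesssim \frac{1}{M} \]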

for those triples. In particular, we have ^ pairs {h,h) for which this inequality 
holds for ^ M values of mi = 0(M). If M ^ log^i(^+^) N for some sufficiently large 
Ci then we may apply Lemma [A. 41 (ii) with parameters 6i ^ 1/M, ^2 ~ 1 and \I\ ~ M 
to conclude that for each such pair (/i, I2), there exists q' ~ 1 such that 

||/l/2g0"(st,St)||M/Z^^. (10.9) 

This condition on M may be met by choosing e ~ 1 sufficiently small, since one of 
the hypotheses of the lemma was that M ^ 1/e. Applying the pigeonhole principle to 
(110.91) . we can now locate a single g ^ 1 such that the above bound holds for ^ pairs 

(/l,/2). 

Taking e sufficiently small we may assume that L,M ^ log^^(^+^) N for suitable C2 
and apply Lemma lA.41 to I2 instead of mi. The parameters in that lemma are now 
Si ~ (52 ~ 1 and \I\ ~ L, and we conclude the existence of g' ^ 1 such that 

\\kq'qcj)"{st,st)\U/z<j^ 

for ^ L values of Zi. Applying Lemma [A. 41 one last time, now with 61 ~ ^2 ~ 1 

and |/| ~ L, we find a g" ^ 1 such that 

1 



\q"q'q(j)"{st,st)\\^/z 



< 



L2M2 

Since q"q'q' ^ 1, the proof of Lemma [10.31 is complete. □ 

It remains to use this lemma to complete the proof of Proposition 10.1 in the Type
II case. We take $\mathcal{D}$ to be simply the whole interval $[D/2X^{1/2}, D/X^{1/2}]$. There is a
very important subtlety here: this set of integers can only be guaranteed to have size
$\gtrsim D/X^{1/2}$ if we assume that $D/X^{1/2} \geq 1$. Note, however, that in the Type II case this
is so since we are working under the assumption that $D \geq \tfrac{1}{2}N^{1/3}$ and $X \leq N^{1/10}$. This
is not just a technical artefact of our approach - it is simply not possible to bound a
general bilinear form, such as the Type II sum

\[ T_{II} = \sum_{d \sim D} \sum_{w \sim W} a_d b_w f(dw) \]

when one of the ranges $d \sim D$ or $w \sim W$ is too short, as the weights $a_d, b_w$ could
conspire to give no cancellation.
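To make the last point concrete, here is a toy example in which the $w$-range has length one and the weights $a_d$ are chosen to undo every phase of $f$, so that the bilinear sum has no cancellation at all. The function `f`, the ranges and the weights are hypothetical choices made purely for illustration.

```python
import cmath
import math

def e(x):
    """The standard character e(x) = exp(2 pi i x)."""
    return cmath.exp(2j * math.pi * x)

def bilinear_sum(a, b, f, d_range, w_range):
    """T = sum over d, w of a[d] * b[w] * f(d*w)."""
    return sum(a[d] * b[w] * f(d * w) for d in d_range for w in w_range)

# Hypothetical unimodular function; any bounded f would serve equally well.
f = lambda n: e(math.sqrt(2) * n * n)

D = 200
d_range = range(D, 2 * D)
w_range = [3]                                   # a w-range of length one: "too short"
b = {3: 1.0}
a = {d: f(d * 3).conjugate() for d in d_range}  # weights tuned to cancel every phase

T = bilinear_sum(a, b, f, d_range, w_range)
```

Here $|T|$ equals the trivial bound $D$, however oscillatory $f$ is; cancellation can only be forced when both ranges are long, which is why the lower bound on $D$ matters.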

Suppose, then, that $d \in \mathcal{D}$ and that $w \in \mathbb{Z}$ satisfies the condition of Proposition 10.1,
namely that $\|dw\|_g \leq 1/X$. In Lemma 10.3 take $L = M := \varepsilon X^{1/2}/10$ and $s := d$,
$t := w$. If $X_0 \sim 1$ is sufficiently large and $X > X_0$ then certainly the two conditions
$L, M \geq 1/\varepsilon$ are satisfied. Furthermore we have

\[ L|s| \leq \frac{\varepsilon X^{1/2}}{10} \cdot \frac{D}{X^{1/2}} \leq \varepsilon D \]

and

\[ M|t| = \frac{\varepsilon X^{1/2}}{10} \cdot \frac{|dw|}{d} \leq \frac{\varepsilon X^{1/2}}{10} \cdot \frac{N\|dw\|_g}{d} \leq \frac{\varepsilon N}{5D} \leq \varepsilon W \]

and finally $\|st\|_g \leq \varepsilon^2/LM$ by the definition of $L$ and $M$. All the conditions of Lemma
10.3 are thus satisfied, and we may infer that

\[ \|\phi''(dw, dw)\|_{\mathbb{R}/\mathbb{Z}, Q} \lesssim X^{-2} \]

for some $Q \sim 1$, as required. □

We may now forget about Type I and II sums, and work with the conclusion of Proposition
10.1 instead. In the next section we will use divisor moment estimates to cast
this conclusion in a more tractable form.



11. Massaging the major arc condition 



In the last two sections we showed that if $\psi(n)e(-\phi(n))$ correlates with Möbius (specifically
if (10.1) holds true) then $\phi$ must exhibit some kind of "major arc" behaviour.
Indeed we proved Proposition 10.1, which we urge the reader to recall now. Our first
task in this section is to cast the conclusion of that proposition in a more usable form.
Throughout this section, we assume that $\phi : B_g(n_0, 100\rho_0) \to \mathbb{R}/\mathbb{Z}$ is a phase for which
(10.1), and hence the conclusion of Proposition 10.1, holds true.

Proposition 11.1. Let $\phi$ be as above, and suppose that the parameter $\rho_1$ satisfies

\[ N^{-c} < \rho_1 < \rho_0 \log^{-C_1(A+1)} N \tag{11.1} \]

for some $c > 0$ and some $C_1$ which is sufficiently large depending on $G/\Gamma$ (in reality $\rho_1$
will be much larger than $N^{-c}$, so the lower bound here is hardly relevant). Then

\[ \|\phi''(n, n)\|_{\mathbb{R}/\mathbb{Z}, Q} \lesssim \rho_1^2 \tag{11.2} \]

for $\gtrsim \rho_1^{3/2}|B_g(0, \rho_1)|$ values of $n \in B_g(0, \rho_1)$, where $Q \sim 1$.

Remarks. Note that since $\rho_1^{3/2}$ is so much bigger than $\rho_1^2$, the conclusion is in the spirit
of the hypotheses of lemmas such as Lemma A.12, where a quadratic whose fractional part was
"close to zero unexpectedly often" was shown to be major arc. We will, in fact, apply
exactly that lemma later in this section. The fact that we can arrange the exponents
3/2 and 2 in this way is ultimately due to the lower bound $|\mathcal{D}| \gtrsim D/X^{1/2}$ in Proposition
10.1; $|\mathcal{D}| \gtrsim D/X$ would not suffice.

Proof. Set $X := 1/\rho_1$ in Proposition 10.1; we may certainly suppose that $C_1$ is so large
that this is permissible. We find $D \ll N^{2/3}$ and a set $\mathcal{D} \subseteq \{1, \ldots, D\}$ of cardinality
$|\mathcal{D}| \gtrsim D/X^{1/2}$ such that

\[ \|\phi''(dw, dw)\|_{\mathbb{R}/\mathbb{Z}, Q} \lesssim \rho_1^2 \]

whenever $d \in \mathcal{D}$ and $w \in \mathbb{Z}$ are such that $dw \in B_g(0, \rho_1)$. Thus, if we define the sets

\[ \Omega := B_g(0, \rho_1) \cap \mathbb{Z}^+; \qquad \Omega_d := \{n \in \Omega : d \mid n\} \]

for each integer $d \geq 1$, it will suffice (noting that $B_g(0, \rho_1)$ is symmetric about the
origin) to prove the estimate

\[ \Big|\bigcup_{d \in \mathcal{D}} \Omega_d\Big| \gtrsim \rho_1^{3/2}|\Omega|. \tag{11.3} \]

Observing from Lemma 16.21 that 



\Q\ > p^N and \Qd\ > j^M ^oi all deV, 

where C depends only on G/T, it follows by taking k, := 1/2C in Lemma rC.2l of Appendix 
Othat 



Since $|\mathcal{D}| \gtrsim D/X^{1/2}$, the result follows immediately. □

As we remarked, the conclusion of Proposition 11.1 has the form "$\phi''(n, n)$ is surprisingly
close to an integer very often" on a small Bohr set $B_g(0, \rho_1)$. The next step is to amplify
this to obtain $\phi''(h, h)$ major arc for a significantly larger set of $h$ (working on $B_g(0, \rho_0)$
rather than $B_g(0, \rho_1)$). More precisely, we now establish a more pleasant characterisation
of major arc:

Lemma 11.2 (Major arcs have small second derivative). Let $\phi$ be as above. Then there
exists $Q_1 \sim 1$ such that

\[ \|\phi''(h, h)\|_{\mathbb{R}/\mathbb{Z}, Q_1} \lesssim \|h\|_g^2 \]

for all $h \in B_g(0, \rho_0)$.

Proof. The idea is to make the quadratic structure of $\phi''$ so explicit that we can apply
Lemma A.12.

We will choose $\rho_1 := \log^{-C_2(A+1)} N$, where $C_2 \geq C_1$ is some constant to be specified
later. In particular if $C_2$ is large enough then the conditions of Proposition 11.1 are
satisfied, and we can find as a result some set $S \subseteq B_g(0, \rho_1)$ such that

\[ |S| \gtrsim \rho_1^{3/2}|B_g(0, \rho_1)| \tag{11.4} \]

and

\[ \|\phi''(n, n)\|_{\mathbb{R}/\mathbb{Z}, Q} \lesssim \rho_1^2 \tag{11.5} \]

for all $n \in S$. Note that the implied constants in the $\lesssim$ and $\gtrsim$ notations here do not
depend on $C_2$. Note also that for reasons like this one must exercise extreme caution
with these notations.

Select some C3 ^ C2. If \\h\\g ^ log^*"^*-"^^^-* A^ then the lemma holds vacuously, and so 
we assume henceforth that \\h\\g ^ log"^='(^+^) A^. Now from ffTOD and Lemma 0(b) 
we have 

3 /2 

^neBg(o,2pi)ls{n + m)> p{ 
for all m G -Bg(0, pi). Applying this to m = hi for all / G {1, . . . , L}, where L := L]|fjj~J 5 
and then averaging in L, we conclude 

^neBg{0,2pi)^l<iKL'^s{n + hi) Z pf ^ 
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and thus by the pigeonhole principle we can find n G -Bg(0, 2pi) such that 

IEi<«<;l15(^ + hi) Z pI'^. 

In particular, we have 

||(/)"(n + /l/,n + /l/)||M/Z,Q ^P? 

for ^ Px'^l^ values of / G {1, . . . , L\. Applying the pigeonhole principle again, we can 
thus find a single g ~ 1 such that 

+ hl,n + hl)\\u/z ~ pi 
for > pf^L values of / G {1, . . . , L}. Now from Corollary EI] (and (H^ ) we can write 

q(f)"{n + hl,n + hi) = ql'^(j)"{h, h) + al + (3 

for some quantities G M/Z which depend on q,(j),n,h but are independent of /. 
Thus 

\\ql^<l)"{h, h) + al + ^ pI 

for ^ p\^'^L values of / G Now Lemma IA.12I applies to exactly this kind of 

situation. In that lemma we take 5i ~ p\ and 62 ~ Pi^^, and note that the requisite 
conditions 5i ^ and L ^ 2^^(5^^^ are handsomely satisfied if C2, C3 are chosen 
judiciously. The conclusion is that 

Setting Qi := 2~'^^5^^ ^ 1 and noting that L ^ the conclusion follows. □ 

In the next lemma, we bootstrap Lemma 11.2 to a depolarized version of itself.

Lemma 11.3 (Major arcs have small second derivative, II). Let $\phi$ be as above. Then
there exist $Q_2 \sim 1$ and $\rho_2 \sim 1$ such that $\rho_2 \leq \rho_0$ and

\[ \|\phi''(h, h')\|_{\mathbb{R}/\mathbb{Z}, Q_2} \lesssim \|h\|_g\|h'\|_g \]

for all $h, h' \in B_g(0, \rho_2)$.

Proof. Let $\rho_2 = \log^{-C_4(A+1)} N$, for some large $C_4$ to be chosen later. By symmetry we
may assume $\|h'\|_g \leq \|h\|_g$. Let $L \geq 1$ be the least integer such that $L\|h'\|_g \geq \|h\|_g$. For
any $l \in \{1, \ldots, L\}$, we use (9.2) and the hypotheses $h, h' \in B_g(0, \rho_2)$ to conclude

\[ 4l\phi''(h, h') = \phi''(h + lh', h + lh') - \phi''(h - lh', h - lh'). \]

Applying Lemma 111.21 and the triangle inequality, we infer 

m"{h,h')\w/^,Q,<\\h\\i 

and hence 

\\¥'{h,h')U/^,,Q,<\\h\\l 

for all / G {1,...,L}. Let C5 be a further constant to be specified later. If L ^ 
log-^5(^+i) A^ then we can set / = 1 and the argument is finished. Suppose, then, that 
L ^ log"'^^^"^'*'^-' A^. By the pigeonhole principle, we can find 5' ^ Qi ~ 1 such that 

\\ql<P"{h,h')\\u/^<\\h\\l 
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for ^ L values of Z G {1, . . . , L}. We are now in a position to apply Lemma rA.4( ii) with 
5i ~ and 82 ~ 1. If C4 is large enough then we certainly have 5i ^ ^82, whilst C5 
may be chosen so that L > 2j8\. In those circumstances the lemma is applicable and 
we deduce that 

This concludes the proof. □ 
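The polarisation identity $4l\phi''(h, h') = \phi''(h + lh', h + lh') - \phi''(h - lh', h - lh')$ on which the proof rests uses only symmetry and bilinearity. As a sanity check it can be verified exactly in the model case $\phi''(h_1, h_2) = 2ah_1h_2 \bmod 1$; the coefficient `a` and the helper names below are hypothetical.

```python
from fractions import Fraction

def frac(x):
    """Reduce a rational modulo 1 (an element of R/Z)."""
    return x - (x.numerator // x.denominator)

# Hypothetical bilinear model: phi''(h1, h2) = 2*a*h1*h2 mod 1 for the quadratic a*n^2.
a = Fraction(5, 23)
bilinear = lambda h1, h2: frac(2 * a * h1 * h2)

def polarisation_gap(h, hp, l):
    """Difference (mod 1) between the two sides of the polarisation identity."""
    left = frac(4 * l * bilinear(h, hp))
    right = frac(bilinear(h + l * hp, h + l * hp) - bilinear(h - l * hp, h - l * hp))
    return frac(left - right)

gaps = [polarisation_gap(h, hp, l)
        for h in range(-6, 7) for hp in range(-6, 7) for l in range(1, 5)]
```

In the actual argument the identity holds only locally, which is why $h \pm lh'$ must remain in a Bohr set where (9.2) applies.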

The above lemma says that for any pair $h, h'$ each having small $\|\cdot\|_g$ norms, the second
derivative $\phi''(h, h')$ is close to a rational number $a/q$ for some small $q$. However, this $q$
can currently depend on $h, h'$. Fortunately, it is possible to "clear denominators" and
make $q$ independent of $h, h'$ by taking advantage of a certain "finite dimensionality" of
the Bohr set $B_g(0, \rho_2)$. More precisely, we have

Lemma 11.4 (Major arcs have small second derivative, III). Let $\phi$ be as above. Then
there exist $\rho_3 \sim 1$ and an integer $q \sim 1$ such that

\[ \|q\phi''(h, h')\|_{\mathbb{R}/\mathbb{Z}} \lesssim \|h\|_g\|h'\|_g \]

for all $h, h' \in B_g(0, \rho_3)$.

Proof. We shall use some standard results from the geometry of numbers to obtain a
"basis" for the Bohr set $B_g(0, \rho_2)$. These results are discussed in several places: see, for
example, [3], [13] and [22, Ch. 3]. Recall at this point the discussion at the end of §8,
where we remarked that $g \in (\mathbb{R}/\mathbb{Z})^d$ can be taken to be a $p$-th root of unity, where
$p \in [10N, 20N]$ is the prime we have associated to $N$ for those arguments where it is
convenient to work in a cyclic group. This is such an argument. We identify $B_g(0, \rho_2)$,
which is certainly contained in $\{1, \ldots, N\}$, with a subset of $\mathbb{Z}/p\mathbb{Z}$. Write
9 = { — ) 
p p 

in (M./7jY, where ^1, . . . G Z/pZ. Let S C Z/pZ be the set of frequencies 

5:= {1,^1,..., a- 

In the notation of ^3], the Bohr set Bg{0, P2) is then comparable to a "traditional" Bohr 
set 

B{S,p) := {x G Z/pZ : Ux/pU/^ < p} 

in the sense that 

B{S, ^2) C Bgif), P2) C B{S, 2p2). (n.6) 

Applying [13, Corollary 10.5], and redefining $d := d + 1$, we can then find a proper
generalized arithmetic progression (by proper we mean that all the sums
$l_1v_1 + \ldots + l_dv_d$ are distinct)

\[ P = \{l_1v_1 + \ldots + l_dv_d : |l_j| \leq L_j \text{ for all } 1 \leq j \leq d\} \]

for some $L_1, \ldots, L_d \geq 1$ and $v_1, \ldots, v_d \in \mathbb{Z}/p\mathbb{Z}$, such that

\[ B_g(0, c\rho_2) \subseteq P \subseteq B_g(0, \rho_2) \]

for some $c = c(d) > 0$. In fact by applying that result to $B_g(0, \tfrac{1}{4}\rho_2)$ (and redefining $P$
and the $L_j$ slightly) we may insist on the slightly stronger inclusions

\[ B_g(0, \tfrac{1}{4}c\rho_2) \subseteq P_{1/4} \subseteq P \subseteq B_g(0, \rho_2) \tag{11.7} \]

where $P_\theta$ is defined for any $\theta \in (0, 1]$ by

\[ P_\theta := \{l_1v_1 + \ldots + l_dv_d : |l_j| \leq \theta L_j \text{ for all } 1 \leq j \leq d\}. \]

We will prove the lemma with $\rho_3 := \tfrac{1}{4}c\rho_2$. Let us note from (11.7) that

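\[ \|v_j\|_g \leq \frac{\rho_2}{L_j} \lesssim \frac{1}{L_j} \]

for each $j$, $1 \leq j \leq d$. Thus by Lemma 11.3 we may find for each $j, j'$, $1 \leq j, j' \leq d$, a
$q_{jj'} \sim 1$ such that

\[ \|q_{jj'}\phi''(v_j, v_{j'})\|_{\mathbb{R}/\mathbb{Z}} \lesssim \frac{1}{L_jL_{j'}}. \]

If we let $q$ be the least common multiple of all the $q_{jj'}$, then we still have $q \sim 1$ and

\[ \|q\phi''(v_j, v_{j'})\|_{\mathbb{R}/\mathbb{Z}} \lesssim \frac{1}{L_jL_{j'}} \]

for all $j, j'$, $1 \leq j, j' \leq d$. Note that at this point the implied constants in the $\lesssim$ notation
have become heavily dependent on $d$. By bilinearity (9.2) it follows that

\[ \|q\phi''(h, h')\|_{\mathbb{R}/\mathbb{Z}} \lesssim \|h\|_P\|h'\|_P \tag{11.8} \]

for all $h, h' \in P$, where the norm $\|\cdot\|_P$ on $P$ is defined by

\[ \|l_1v_1 + \ldots + l_dv_d\|_P := \sup_{1 \leq j \leq d} \frac{|l_j|}{L_j}. \]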

We claim that $\|h\|_P \lesssim \|h\|_g$ for all $h \in B_g(0, \rho_3)$. In view of (11.8), this will suffice to
prove the lemma.

We may assume that $h \neq 0$ since the claim is trivial otherwise. Observe that $h \in P_{1/2}$.
Let $M \geq 1$ be the smallest positive integer such that $Mh \notin P_{1/4}$; since $Mh = (M - 1)h + h$,
we see that $Mh \in P_{1/2}$. Thus $\|Mh\|_P \leq 1/2$, which implies that $\|h\|_P \leq 1/2M$ (here
we use the hypothesis that $P$ is proper, which implies that the co-ordinates $l_1, \ldots, l_d$
of $Mh$ are $M$ times the co-ordinates of $h$). On the other hand, since $Mh \notin P_{1/4}$, we
have $Mh \notin B_g(0, \rho_3)$, which implies $M\|h\|_g \geq \rho_3$ and hence that $\|h\|_g \geq \rho_3/M \gtrsim 1/M$.
Combining these estimates we obtain the claim, and hence the lemma. □
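The "clearing denominators" step in this proof uses only the elementary inequality $\|k\alpha\|_{\mathbb{R}/\mathbb{Z}} \leq k\|\alpha\|_{\mathbb{R}/\mathbb{Z}}$ for positive integers $k$: passing from an individual denominator $q_{jj'}$ to the least common multiple $q$ costs a factor $q/q_{jj'}$, which is harmless as long as there are boundedly many bounded denominators. A small exact illustration, with hypothetical data `xs` and `qs`:

```python
from fractions import Fraction
from math import lcm

def circle_norm(x):
    """Distance from the rational x to the nearest integer."""
    r = x - (x.numerator // x.denominator)
    return min(r, 1 - r)

# Hypothetical data: each x is within 10^-6 of a rational with a small denominator.
tiny = Fraction(1, 10**6)
xs = [Fraction(1, 2) + tiny, Fraction(2, 3) - tiny, Fraction(3, 4) + tiny]
qs = [2, 3, 4]              # individual denominators, one per x

q = lcm(*qs)                # common denominator, here 12
individual = [circle_norm(qj * x) for qj, x in zip(qs, xs)]
common = [circle_norm(q * x) for x in xs]
# Replacing q_j by q costs at most the integer factor q // q_j.
bounded = all(c <= (q // qj) * i for c, i, qj in zip(common, individual, qs))
```

This sketch assumes Python 3.9 or later for `math.lcm`. In the lemma the number of pairs $(j, j')$ is $d^2$, which is why the implied constants become heavily dependent on $d$.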



12. Handling the major arcs 



Let us summarise the current state of affairs. In our effort to prove Proposition 8.1, we
assumed that its conclusion (8.2) was false. After a long and complicated analysis, we
deduced from this assumption that the phase $\phi$ is major arc, in the sense that we have
an estimate

\[ \|q\phi''(h, h')\|_{\mathbb{R}/\mathbb{Z}} \lesssim \|h\|_g\|h'\|_g \]

whenever $h, h' \in B_g(0, \rho_3)$, for some $q \sim 1$ and some $\rho_3 \sim 1$. This was, of course, the
content of Lemma 11.4. To close the argument, we relate major arc phases of this type
to those appearing in Proposition 6.3. This is not hard (though a little technical), and
leads quickly to a contradiction (of the assumption that (8.2) was false).

Let $q$ be as above. By bilinearity (9.2) again, we see that

\[ \|\phi''(h, h')\|_{\mathbb{R}/\mathbb{Z}} \lesssim \|h\|_g\|h'\|_g \]

for all $h, h' \in B_g(0, \rho_3)$ such that $q \mid h, h'$. Let $\varepsilon \ll \rho_3$, $\varepsilon \sim 1$, be a small number to be
chosen later. Applying (9.1), we conclude the approximate linearity relationship

\[ \|\phi(n + h_1 + h_2) - \phi(n + h_1) - \phi(n + h_2) + \phi(n)\|_{\mathbb{R}/\mathbb{Z}} \lesssim \varepsilon^2 < \varepsilon \tag{12.1} \]

whenever $n \in B_g(n_0, 2\rho_0)$, whenever $h_1, h_2 \in B_g(0, 20\varepsilon)$ are such that $q \mid h_1, h_2$, and
provided that $\varepsilon$ is small enough.

Now due to the finite dimensionality of the space (M/Z)'^ x M from which the metric 
||r2 — m||g is naturally descended (cf. the remarks following Definition 16. II) we may cover 
Bg{nQ,po) with 0{e~'^) Bohr sets Bg{na,e) such that each point is contained in 0(1) of 
these Bohr sets. This induces a corresponding partition of ip into 0{e~'") functions ipa, 
each of which is supported on a Bohr set Bg^ria, e) and still obeys the Lipschitz bound 

m- 

Now observe that if n,n + hi,n + h2,n + hi + hi G Bg{na, lOe), and if q\hi, hi then 
(112.11) holds. Thus we may apply Proposition 16.31 (with p = e) to conclude that for any 
K ^ e, and for some A' to be chosen later, we have 

\EN<n^2NfJ'iri)'ipa{n)e{-(l){n))\ <a' K"^q^\og~^' N + (e + K)EN<n<i2N\^a\- 

Summing in a, using the bounded overlap of the Bohr sets and the fact that \\ip\\oo ^ 
Po <^ 1, we conclude 

|Eiv<n^2iV/u(n)^(n)e(-0(r2)) I <^a' {eK)'^q^ log"^' N + e + k. (12.2) 

At this point we set|^ k = e = log"*"^"^^^-* for some C > 1 which is so large that (112.11) 
holds. Recalling that g ^ 1, we see that A' may be chosen so that the right-hand side 
of (fTO) is < log-^ N. ~ 

We have, at long last, contradicted the supposition that (8.2) is false. This implies
Proposition 8.1. By the analysis of §, Theorem 2.2 is also true, and thus, by the
deduction immediately after the statement of Theorem 2.2, so is the Main Theorem. □



Appendix A. Some harmonic analysis tools 

In this appendix we collect some simple harmonic analysis tools which are used frequently
in the paper. We begin by introducing some norms on the unit circle $\mathbb{R}/\mathbb{Z}$,
which can be lifted up to the real line $\mathbb{R}$.



We kept the parameters $\varepsilon$ and $\kappa$ separate in Proposition 6.3 for pedagogical reasons, to make the
dependencies clear.
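Definition A.1 (Circle norms). If $\alpha$ is an element of the real line $\mathbb{R}$ or the circle $\mathbb{R}/\mathbb{Z}$,
we use $\|\alpha\|_{\mathbb{R}/\mathbb{Z}}$ to denote the distance from $\alpha$ to the nearest integer (if $\alpha$ is real) or to
zero (if $\alpha$ is on the circle $\mathbb{R}/\mathbb{Z}$). If $Q \ge 1$ is an integer, we use $\|\alpha\|_{\mathbb{R}/\mathbb{Z},Q}$ to denote the
quantity

$$\|\alpha\|_{\mathbb{R}/\mathbb{Z},Q} := \inf_{1 \le q \le Q} \|q\alpha\|_{\mathbb{R}/\mathbb{Z}}.$$

The quantity $\|\alpha\|_{\mathbb{R}/\mathbb{Z}}$ is subadditive, thus $\|\alpha + \beta\|_{\mathbb{R}/\mathbb{Z}} \le \|\alpha\|_{\mathbb{R}/\mathbb{Z}} + \|\beta\|_{\mathbb{R}/\mathbb{Z}}$. We caution
however that the quantity $\|\alpha\|_{\mathbb{R}/\mathbb{Z},Q}$ (which is large when $\alpha$ lies in a "minor arc", and
small when $\alpha$ lies in a "major arc") is not subadditive.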
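The failure of subadditivity can be seen in a small exact computation. The following Python sketch is an illustration only and is not part of the original argument; the function names `circle_norm` and `circle_norm_q` and the choice $\alpha = 1/2$, $\beta = 1/3$, $Q = 3$ are ours.

```python
from fractions import Fraction


def circle_norm(alpha):
    """Distance from alpha to the nearest integer, i.e. ||alpha||_{R/Z}."""
    frac = alpha % 1  # representative of alpha in [0, 1)
    return min(frac, 1 - frac)


def circle_norm_q(alpha, Q):
    """||alpha||_{R/Z,Q}: the minimum of ||q * alpha||_{R/Z} over 1 <= q <= Q."""
    return min(circle_norm(q * alpha) for q in range(1, Q + 1))


# Hypothetical example values: both alpha and beta are "major arc" for Q = 3,
# since 2 * alpha and 3 * beta are integers, but their sum 5/6 is not.
alpha, beta, Q = Fraction(1, 2), Fraction(1, 3), 3
lhs = circle_norm_q(alpha + beta, Q)
rhs = circle_norm_q(alpha, Q) + circle_norm_q(beta, Q)
print(lhs, rhs)
```

Here `lhs` exceeds `rhs`, whereas the plain norm `circle_norm` does satisfy the triangle inequality on the same inputs.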

Define a discrete interval to be any set of the form $\{n \in \mathbb{Z} : a \le n \le b\}$ for some $a, b$.
By summing the geometric series, we observe the elementary exponential sum estimate

$$\Big|\sum_{n \in I} e(\alpha n)\Big| \le 4 \min\Big(|I|, \frac{1}{\|\alpha\|_{\mathbb{R}/\mathbb{Z}}}\Big) \tag{A.1}$$

for any discrete interval $I \subset \mathbb{Z}$ and any $\alpha \in \mathbb{R}/\mathbb{Z}$ (or any $\alpha \in \mathbb{R}$). One consequence
of this is the following Pólya-Vinogradov type completion of sums lemma, which allows
one to estimate a partial sum by a completed sum at the cost of a logarithm and an
exponential phase.

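Lemma A.2 (Completion of sums). Let $I \subset \mathbb{Z}$ be a discrete interval, and $f : \mathbb{Z} \to \mathbb{C}$
be a function. Then we have

$$\sup_{J \subset I} \Big|\sum_{n \in J} f(n)\Big| \ll \log(1 + |I|) \sup_{\alpha \in \mathbb{R}/\mathbb{Z}} \Big|\sum_{n \in I} f(n) e(\alpha n)\Big|$$

where the supremum on the left ranges over discrete sub-intervals of $I$. More generally,
if $I' \subset \mathbb{Z}$ is another discrete interval, and $K : \mathbb{Z} \times \mathbb{Z} \to \mathbb{C}$ is a function, then we have

$$\sum_{m \in I'} \Big|\sum_{n \in I} 1_{J_m}(n) K(n,m)\Big|^2 \ll \log^2(1 + |I|) \sup_{\alpha \in \mathbb{R}/\mathbb{Z}} \sum_{m \in I'} \Big|\sum_{n \in I} K(n,m) e(\alpha n)\Big|^2$$

where for each $m \in I'$, $J_m$ is an arbitrary discrete interval.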



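Proof. We may assume $I$ is non-empty. By translation we may take $I = \{1, \dots, L\}$ for
some $L \ge 1$, which we then identify with $\mathbb{Z}/L\mathbb{Z}$. If $J$ is any interval in $\mathbb{Z}/L\mathbb{Z}$, we can
use Fourier expansion in $\mathbb{Z}/L\mathbb{Z}$ to write

$$\sum_{n \in J} f(n) = \sum_{n \in \mathbb{Z}/L\mathbb{Z}} 1_J(n) f(n) = \sum_{\xi \in \mathbb{Z}/L\mathbb{Z}} \hat{1}_J(\xi) \sum_{n \in \mathbb{Z}/L\mathbb{Z}} e(n\xi/L) f(n)$$

where

$$\hat{1}_J(\xi) := \mathbb{E}_{n \in \mathbb{Z}/L\mathbb{Z}} 1_J(n) e(-n\xi/L).$$

Applying (A.1), we have

$$|\hat{1}_J(\xi)| \le 4 \min\Big(1, \frac{1}{L \|\xi/L\|_{\mathbb{R}/\mathbb{Z}}}\Big).$$

Thus by the triangle inequality, we have

$$\Big|\sum_{n \in J} f(n)\Big| \ll \sum_{\xi \in \mathbb{Z}/L\mathbb{Z}} \min\Big(1, \frac{1}{L \|\xi/L\|_{\mathbb{R}/\mathbb{Z}}}\Big) \sup_{\alpha \in \mathbb{R}/\mathbb{Z}} \Big|\sum_{n \in I} e(n\alpha) f(n)\Big| \ll \log(1 + L) \sup_{\alpha \in \mathbb{R}/\mathbb{Z}} \Big|\sum_{n \in I} e(n\alpha) f(n)\Big|,$$

which gives the first inequality. Using similar arguments, as well as the triangle inequality
in $\ell^2$, we have

$$\Big(\sum_{m \in I'} \Big|\sum_{n \in I} 1_{J_m}(n) K(n,m)\Big|^2\Big)^{1/2} \ll \sum_{\xi \in \mathbb{Z}/L\mathbb{Z}} \min\Big(1, \frac{1}{L \|\xi/L\|_{\mathbb{R}/\mathbb{Z}}}\Big) \sup_{\alpha \in \mathbb{R}/\mathbb{Z}} \Big(\sum_{m \in I'} \Big|\sum_{n \in I} e(n\alpha) K(n,m)\Big|^2\Big)^{1/2}$$

$$\ll \log(1 + L) \sup_{\alpha \in \mathbb{R}/\mathbb{Z}} \Big(\sum_{m \in I'} \Big|\sum_{n \in I} e(n\alpha) K(n,m)\Big|^2\Big)^{1/2}.$$

□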
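The geometric series estimate (A.1), on which the proof above rests, is easy to test numerically. The sketch below is only a floating-point sanity check under our own choice of test frequencies and intervals; the helper names (`exp_sum`, `a1_bound`, `check_a1`) are hypothetical and do not come from the paper.

```python
import cmath


def e(x):
    """The standard character e(x) = exp(2 * pi * i * x)."""
    return cmath.exp(2j * cmath.pi * x)


def circle_norm(alpha):
    """Distance from the real number alpha to the nearest integer."""
    frac = alpha % 1.0
    return min(frac, 1.0 - frac)


def exp_sum(alpha, a, b):
    """Sum of e(alpha * n) over the discrete interval a <= n <= b."""
    return sum(e(alpha * n) for n in range(a, b + 1))


def a1_bound(alpha, a, b):
    """Right-hand side of (A.1): 4 * min(|I|, 1 / ||alpha||_{R/Z})."""
    size = b - a + 1
    dist = circle_norm(alpha)
    return 4 * size if dist == 0 else 4 * min(size, 1 / dist)


def check_a1(alphas, intervals, tol=1e-9):
    """True if (A.1) holds, up to rounding tolerance, for every pair tested."""
    return all(
        abs(exp_sum(alpha, a, b)) <= a1_bound(alpha, a, b) + tol
        for alpha in alphas
        for (a, b) in intervals
    )


# Hypothetical test grid: frequencies k/97 and a few short intervals.
print(check_a1([k / 97 for k in range(97)], [(1, 50), (-20, 30), (5, 5)]))
```

A check of this kind cannot replace the one-line proof, but it guards against transcription slips in the constant or in the normalisation of $e(\cdot)$.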



In a similar spirit, we now recall the well-known Erdős-Turán inequality:

Proposition A.3 (Erdős-Turán inequality). Let $(u_l)_{l=1}^L$ be a sequence in $\mathbb{R}/\mathbb{Z}$, and
define the discrepancy $\Delta(\alpha,\beta)$ for any $-\frac{1}{2} \le \alpha < \beta < \frac{1}{2}$ by the formula

$$\Delta(\alpha, \beta) := \#\{l \in \{1, \dots, L\} : u_l \in [\alpha, \beta]\} - (\beta - \alpha)L.$$

Then for any positive integer $Q$ we have

$$|\Delta(\alpha,\beta)| \le \frac{L}{Q} + 3 \sum_{q=1}^{Q} \frac{1}{q} \Big|\sum_{l=1}^{L} e(q u_l)\Big|.$$

Proof. See for instance [19]. The constant 3 is unimportant for us, and could be improved
slightly. □

An important application of this inequality for us (which we will use extremely frequently)
will be the following observation, which says that if a linear sequence $\alpha l$ stays
close to an integer for many $l$ in an interval $I$, then $\alpha$ must be "major arc", in the sense
that $\|\alpha\|_{\mathbb{R}/\mathbb{Z},Q}$ is small for some small $Q$.

Lemma A.4 (Recurrent linear functions are major arc). Let $I \subset \mathbb{Z}$ be a discrete
interval, let $\alpha \in \mathbb{R}/\mathbb{Z}$, and suppose that the set

$$E := \{l \in I : \|\alpha l\|_{\mathbb{R}/\mathbb{Z}} \le \delta_1\}$$

has cardinality at least S2\I\ for some < 5i, ^2 < 1 with 61 ^ ^62- 
(i) If\I\ > 1/62, then ||a||]R/z,8/<S2 ^ T2



(ii) If\I\ > 2/61 ihen ||a||K/z,i6/52 ^ 



2^ 

6I\I\ 



Proof. Write $I = \{M + 1, \dots, M + L\}$, and let $(u_l)_{l=1}^L$ be the sequence $u_l := \alpha(M + l) \pmod 1$.
Then the lower bound on $|E|$ implies the discrepancy estimate

Let us now prove (i). Applying Proposition A.3 we conclude

^ ^ q=l^ 1=1 

for any $Q$. Taking $Q := [4/\delta_2]$, this implies that there is $q \le 8/\delta_2$ such that

L 

\Y,e{qui)\^2~'6lL. 
1=1 

Applying (A.1), the result follows.

We now use a standard "amplification" argument, exploiting the smallness of $\delta_1$ compared
to $\delta_2$, to bootstrap (i) to the stronger estimate (ii). We may assume that
$\delta_1 < \delta_2^2/16$ since the result follows immediately from (i) otherwise. Let $1 \le m \le L$
be an integer to be chosen later; then by the pigeonhole principle and the lower bound
on $|E|$ there exists some $b$ such that the set

$$E_b := \{b + 1, \dots, b + m\} \cap E$$

has cardinality at least $\delta_2 m/2$. We fix $b$, and note that if $x \in mE + E_b$, that is to say
if $x = ml + l'$ with $l \in E$ and $l' \in E_b$, then $\|\alpha x\|_{\mathbb{R}/\mathbb{Z}} \le 2m\delta_1$. Furthermore we have
$|mE + E_b| \ge \delta_2^2 mL/2$, and also $mE + E_b$ is a subset of the interval

$$I' := \{m(M + 1) + b + 1, \dots, m(M + L) + b + m\},$$

which has cardinality at most $mL$. We can apply (i) with $I$, $\delta_1$, $\delta_2$ replaced by $I'$, $2m\delta_1$,
and $\delta_2^2/2$, provided that $m \le \delta_2^2/16\delta_1$ and $mL > 2/\delta_2^2$. It being sensible to take $m$
essentially as large as possible, set $m := \lfloor \delta_2^2/16\delta_1 \rfloor$. The result follows quickly. □

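Next, we record a version of summation by parts. Define the total variation $\|\psi\|_{TV}$ of
a sequence $\psi : \mathbb{Z} \to \mathbb{C}$ to be the quantity

$$\|\psi\|_{TV} := \sup_{n \in \mathbb{Z}} |\psi(n)| + \sum_{n \in \mathbb{Z}} |\psi(n + 1) - \psi(n)|,$$

and more generally define the total variation modulo $q$ for any $q \ge 1$ to be the quantity

$$\|\psi\|_{TV,q} := \sup_{n \in \mathbb{Z}} |\psi(n)| + \sum_{n \in \mathbb{Z}} |\psi(n + q) - \psi(n)|.$$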
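Lemma A.5 (Summation by parts). If $f, \psi : \mathbb{Z} \to \mathbb{C}$ and $I$ is an interval, then

$$\Big|\sum_{n \in I} f(n)\psi(n)\Big| \le \|\psi\|_{TV} \sup_{J \subset I} \Big|\sum_{n \in J} f(n)\Big|.$$

More generally, for any $q \ge 1$ we have

$$\Big|\sum_{n \in I} f(n)\psi(n)\Big| \le q \|\psi\|_{TV,q} \sup_{J \subset I,\ a \in \mathbb{Z}/q\mathbb{Z}} \Big|\sum_{n \in J} f(n) 1_{n \equiv a \ (\mathrm{mod}\ q)}\Big|.$$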



Proof. Write $I = \{u, \dots, v\}$, and denote by $S_n := \sum_{j=u}^{n} f(j)$ the partial sums of $f$.
Recalling the summation by parts formula

$$\sum_{n \in I} f(n)\psi(n) = S_v \psi(v) + \sum_{n=u}^{v-1} S_n (\psi(n) - \psi(n + 1)),$$

the first inequality follows immediately. The second bound follows by splitting $I$ into
$q$ residue classes modulo $q$ and applying a rescaled version of the first identity to each
component. □
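The summation by parts formula used in this proof can be confirmed exactly on integer-valued test sequences. The Python sketch below is an illustration under our own assumptions: the function names are hypothetical, and `local_variation` only computes the part $|\psi(v)| + \sum_{u \le n < v} |\psi(n+1) - \psi(n)|$ of the total variation that the formula actually uses, which is at most $\|\psi\|_{TV}$.

```python
def summation_by_parts_sides(f, psi, u, v):
    """Return both sides of the summation by parts formula on I = {u, ..., v}."""
    lhs = sum(f(n) * psi(n) for n in range(u, v + 1))
    partial = {}  # partial[n] = S_n = f(u) + ... + f(n)
    running = 0
    for n in range(u, v + 1):
        running += f(n)
        partial[n] = running
    rhs = partial[v] * psi(v) + sum(
        partial[n] * (psi(n) - psi(n + 1)) for n in range(u, v)
    )
    return lhs, rhs


def max_interval_sum(f, u, v):
    """Largest |sum of f over J| for discrete sub-intervals J of {u, ..., v}."""
    best = 0
    for a in range(u, v + 1):
        running = 0
        for b in range(a, v + 1):
            running += f(b)
            best = max(best, abs(running))
    return best


def local_variation(psi, u, v):
    """|psi(v)| plus the variation of psi along {u, ..., v}."""
    return abs(psi(v)) + sum(abs(psi(n + 1) - psi(n)) for n in range(u, v))


# Hypothetical test sequences on the interval {2, ..., 9}.
f = lambda n: (-1) ** n * (n + 1)
psi = lambda n: n * n - 3 * n
lhs, rhs = summation_by_parts_sides(f, psi, 2, 9)
print(lhs == rhs)
```

Since every partial sum $S_n$ is a sum of $f$ over a sub-interval of $I$, the triangle inequality applied to the right-hand side gives the first bound of Lemma A.5.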

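Corollary A.6 (Completion of sums, II). Let $I \subset \mathbb{Z}$ be a discrete interval, and $f : \mathbb{Z} \to \mathbb{C}$
and $\psi : \mathbb{Z} \to \mathbb{C}$ be functions. Then we have

$$\sum_{n \in I} \psi(n) f(n) \ll \log(1 + |I|) \|\psi\|_{TV} \sup_{\alpha \in \mathbb{R}/\mathbb{Z}} \Big|\sum_{n \in I} f(n) e(\alpha n)\Big|$$

and more generally for any $q \ge 1$

$$\sum_{n \in I} \psi(n) f(n) \ll q \log(1 + |I|) \|\psi\|_{TV,q} \sup_{\alpha \in \mathbb{R}/\mathbb{Z}} \Big|\sum_{n \in I} f(n) e(\alpha n)\Big|.$$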



Proof. The first part is immediate from Lemmas A.2 and A.5. To obtain the second
bound, we begin with an invocation of the second bound in Lemma A.5. It is now
sufficient to prove that

sup I y~'/(n)l„=a(mod9)| ^ sup I V'/(n)e(an)|. 

JCI,aeZ/qZ am/Z 

To see this, expand $1_{n \equiv a \ (\mathrm{mod}\ q)}$ as a Fourier series

$$1_{n \equiv a \ (\mathrm{mod}\ q)} = \frac{1}{q} \sum_{b \in \mathbb{Z}/q\mathbb{Z}} e\Big(\frac{(a - n) b}{q}\Big),$$

and apply Lemma A.2 and the triangle inequality. □

As a consequence of this Corollary, we can obtain the following convenient lemma,
which allows us to replace the range $1 \le n \le N$ by a smooth cutoff to the interval
$N < n \le 2N$, at the expense of adding an arbitrary linear phase to the function (which
in our applications will be totally harmless).
Lemma A.7. Let $f : \mathbb{N} \to \mathbb{C}$ be a sequence bounded by $O(1)$. Let $\psi : \mathbb{R} \to \mathbb{R}$ be a
Lipschitz non-negative function of Lipschitz norm $O(1)$ which is at least 1 on $[4/3, 5/3]$.
Suppose that we know that

$$\mathbb{E}_{N < n \le 2N} \psi(\tfrac{n}{N}) f(n) e(\alpha n) \ll_A \log^{-A} N$$

for all $A > 0$, $N \ge 1$, and $\alpha \in \mathbb{R}/\mathbb{Z}$. Then we have

$$\mathbb{E}_{1 \le n \le N} f(n) \ll_{A,\psi} \log^{-A} N$$

for all $A > 0$ and $N \ge 1$.

Proof. For large $N$ we can write

$$|\mathbb{E}_{4N/3 < n \le 5N/3} f(n)| \ll |\mathbb{E}_{N < n \le 2N} \psi(\tfrac{n}{N}) f(n) g(n)|$$

where

$$g(n) := 1_{4N/3 < n \le 5N/3} \psi^{-1}(\tfrac{n}{N}).$$

Since $\psi^{-1}$ is Lipschitz on $[4/3, 5/3]$, we have $\|g\|_{TV} \ll_\psi 1$ and hence by Corollary A.6
and hypothesis

$$\mathbb{E}_{4N/3 < n \le 5N/3} f(n) \ll_\psi \log N \sup_{\alpha \in \mathbb{R}/\mathbb{Z}} |\mathbb{E}_{N < n \le 2N} \psi(\tfrac{n}{N}) f(n) e(\alpha n)| \ll_{A,\psi} \log^{-A} N. \tag{A.2}$$

Now we may decompose the interval $\{1, \dots, N\}$ into $O(\log N)$ intervals of type $4M/3 < n \le 5M/3$
together with $O(\log N)$ extra points. Combining (A.2) with the bound
$f = O(1)$, we obtain the lemma. □

Another harmonic analysis tool we will need often is to approximate Lipschitz functions 
by exponentials. We first recall a well-known extension lemma: 

Lemma A.8 (Lipschitz extension). If $Y$ is a non-empty subset of a metric space $X = (X, d)$,
and $f : Y \to \mathbb{R}$ is a Lipschitz function then there exists a Lipschitz extension
$f_{ext} : X \to \mathbb{R}$ of $f$ from $Y$ to $X$ with $\|f_{ext}\|_{Lip} = \|f\|_{Lip}$. Similarly, if $f : Y \to \mathbb{C}$ is
Lipschitz then there exists an extension $f_{ext} : X \to \mathbb{C}$ with $\|f_{ext}\|_{Lip} \le 2\|f\|_{Lip}$.

Proof. If $f$ is real-valued one can for instance define $f_{ext}(x) := \min(\inf\{f(y) + M d(x, y) : y \in Y\}, \sup_{y \in Y} f(y))$,
where $M := \|f\|_{Lip}$. The complex case then follows by splitting
$f$ into real and imaginary parts. □
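The extension formula in the proof can be tried out on a finite example. The following Python sketch is illustrative only: it works on finitely many points of the real line with exact rationals, takes $M$ to be the smallest Lipschitz constant of $f$ on $Y$ (an assumption on our part, since the paper's norm $\|f\|_{Lip}$ may be normalised differently), and uses hypothetical names throughout.

```python
from fractions import Fraction


def lipschitz_constant(points, f, dist):
    """Smallest M with |f(y) - f(z)| <= M * d(y, z) on the finite set `points`."""
    return max(
        (abs(f[y] - f[z]) / dist(y, z) for y in points for z in points if y != z),
        default=Fraction(0),
    )


def extend(points, f, dist, M):
    """Return f_ext(x) = min(inf_y (f(y) + M * d(x, y)), sup_y f(y))."""
    top = max(f[y] for y in points)

    def f_ext(x):
        return min(min(f[y] + M * dist(x, y) for y in points), top)

    return f_ext


# Hypothetical data: Y is three points of the real line, X a finer grid.
dist = lambda a, b: abs(a - b)
Y = [Fraction(0), Fraction(1), Fraction(3)]
f = {Fraction(0): Fraction(0), Fraction(1): Fraction(2), Fraction(3): Fraction(1)}
M = lipschitz_constant(Y, f, dist)
f_ext = extend(Y, f, dist, M)
X = [Fraction(k, 4) for k in range(-8, 21)]
print(all(f_ext(y) == f[y] for y in Y))
```

The infimum of the $M$-Lipschitz functions $x \mapsto f(y) + M d(x,y)$ is again $M$-Lipschitz, and truncating at $\sup f$ keeps both the Lipschitz constant and the upper bound of $f$.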

Lemma A.9 (Fourier approximation of Lipschitz functions). Let $(\mathbb{R}/\mathbb{Z})^d$ be the standard
$d$-dimensional torus, with metric induced by the $\ell^\infty$ norm

$$\|(x_1, \dots, x_d)\|_{(\mathbb{R}/\mathbb{Z})^d} := \sup_{1 \le j \le d} \|x_j\|_{\mathbb{R}/\mathbb{Z}}. \tag{A.3}$$

Let $Y$ be a subset of $(\mathbb{R}/\mathbb{Z})^d$, and let $f : Y \to \mathbb{C}$ be a Lipschitz function bounded in
magnitude by 1. Then for any $N \ge 1$ there exist $J = O_d(N^d)$, $c_1, \dots, c_J = O(1)$, and
$m_1, \dots, m_J \in \mathbb{Z}^d$ such that

$$f(x) = \sum_{j=1}^{J} c_j e(m_j \cdot x) + O_d\Big(\|f\|_{Lip} \frac{\log N}{N}\Big)$$

for all $x \in Y$. Furthermore, the values of $m_1, \dots, m_J$ depend on $L$, $d$, $N$ but are
otherwise independent of $f$ or $Y$.



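Proof. By Lemma A.8 we may take $Y = (\mathbb{R}/\mathbb{Z})^d$. Let $\sigma_N : (\mathbb{R}/\mathbb{Z})^d \to \mathbb{R}^+$ be the Fejér
kernel

$$\sigma_N(x_1, \dots, x_d) := \prod_{i=1}^{d} \frac{1}{N} \Big(\frac{\sin(\pi N x_i)}{\sin(\pi x_i)}\Big)^2.$$

Note that

$$\hat{\sigma}_N(m) = \prod_{i=1}^{d} \Big(1 - \frac{|m_i|}{N}\Big) 1_{|m_i| \le N}$$

for all $m \in \mathbb{Z}^d$. We have

$$f * \sigma_N(x) = \sum_{m} \widehat{f * \sigma_N}(m) e(m \cdot x) = \sum_{m} \hat{f}(m) \hat{\sigma}_N(m) e(m \cdot x)$$

which, since $\|f\|_\infty = O(1)$, has the form $\sum_{j=1}^{J} c_j e(m_j \cdot x)$ where $J = O_d(N^d)$ and
$c_j = O(1)$. To conclude the proof of the lemma, then, it suffices to show that
$\|f - f * \sigma_N\|_\infty = O_d(\|f\|_{Lip} \log N / N)$. To this end, note that

$$|f(x) - f * \sigma_N(x)| = \Big|\int (f(x) - f(y)) \sigma_N(x - y)\, dy\Big|,$$

and hence by the change of variables $z := x - y$ it will suffice to show that

$$\int_{(\mathbb{R}/\mathbb{Z})^d} \|z\|_{(\mathbb{R}/\mathbb{Z})^d} \sigma_N(z)\, dz = O_d(\log N / N).$$

Since $\sigma_N$ has total mass one, the portion of the integral on the region $\|z\|_{(\mathbb{R}/\mathbb{Z})^d} \le N^{-1}$
is acceptable. Now, for each integer $n \ge 0$, consider the portion of the integral on the
annular region $2^n N^{-1} \le \|z\|_{(\mathbb{R}/\mathbb{Z})^d} \le 2^{n+1} N^{-1}$. We have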

/ \\z\\^R/^y(TN{z)dz\ < 2"iV-^ / \crN{t)\dt 

i||tl|L/z»2"7V-i N Sm (TTti) 



-^liti||H/z>2"Af-l l|f^l||R/Z 

Summing this over $n = 0, 1, \dots, N$ we obtain the claim. □

We shall adopt the following convenient notation from [13]: we use $b(x_1, \dots, x_k)$ to
denote any function of the variables $x_1, \dots, x_k$ which is bounded by $O(1)$; the exact value
of $b()$ may vary from line to line, just as with the $O()$ notation. We use this notation
to denote functions whose exact value is not of interest to us, invariably because they
are destined to be annihilated in the course of a Cauchy-Schwarz argument such as the
following.
Lemma A.10 (Cauchy-Schwarz inequality). Let $X, Y$ be finite non-empty sets, and let
$f : X \times Y \to \mathbb{C}$ be a function. Then

$$|\mathbb{E}_{x \in X} \mathbb{E}_{y \in Y} b(x) f(x,y)| \ll |\mathbb{E}_{x \in X} \mathbb{E}_{y, y' \in Y} f(x,y) \overline{f(x,y')}|^{1/2}$$

and

$$|\mathbb{E}_{x \in X} \mathbb{E}_{y \in Y} b(x) b(y) f(x,y)| \ll |\mathbb{E}_{x, x' \in X} \mathbb{E}_{y, y' \in Y} f(x,y) \overline{f(x,y')}\, \overline{f(x',y)} f(x',y')|^{1/4}.$$
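Similarly, if $K : X^4 \to \mathbb{C}$ is a function, then

$$|\mathbb{E}_{x_1,x_2,x_3,x_4 \in X} b(x_2,x_3,x_4) b(x_1,x_3,x_4) b(x_1,x_2,x_4) b(x_1,x_2,x_3) K(x_1,x_2,x_3,x_4)|$$

$$\ll \Big|\mathbb{E}_{x_1^{(0)}, x_1^{(1)}, \dots, x_4^{(0)}, x_4^{(1)} \in X} \prod_{i_1,i_2,i_3,i_4 \in \{0,1\}} \mathcal{C}^{i_1+i_2+i_3+i_4} K(x_1^{(i_1)}, x_2^{(i_2)}, x_3^{(i_3)}, x_4^{(i_4)})\Big|^{1/16}$$

where $\mathcal{C} : z \mapsto \overline{z}$ is the conjugation operator.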

Remark. These estimates are part of the theory of the Gowers uniformity norms 
and \\K\\a<i; see for instance [II [TOl [IH [El [ISl EI] • 

Proof. From the triangle inequality and Cauchy-Schwarz we have

$$|\mathbb{E}_{x \in X} \mathbb{E}_{y \in Y} b(x) f(x,y)| \ll \mathbb{E}_{x \in X} |\mathbb{E}_{y \in Y} f(x,y)| \le (\mathbb{E}_{x \in X} |\mathbb{E}_{y \in Y} f(x,y)|^2)^{1/2}$$

and the first claim follows. The second claim follows by two iterations of the first, and
the third follows from four iterations of the first. □
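To see the mechanism of the first inequality in a concrete case, one can test it on random data. In the sketch below (a floating-point illustration with hypothetical names, not taken from the paper) the weight $b$ is taken to have modulus exactly 1, so that the implied constant is 1 and the inequality can be checked with $\le$.

```python
import cmath
import random


def random_instance(seed, nx, ny):
    """Random f : X x Y -> unit disc and unimodular b : X -> C (test data only)."""
    rng = random.Random(seed)
    f = [[cmath.rect(rng.random(), rng.uniform(0, 2 * cmath.pi)) for _ in range(ny)]
         for _ in range(nx)]
    b = [cmath.rect(1.0, rng.uniform(0, 2 * cmath.pi)) for _ in range(nx)]
    return f, b


def cauchy_schwarz_sides(f, b):
    """Return (|E_x E_y b(x) f(x,y)|, |E_x E_{y,y'} f(x,y) conj f(x,y')|^(1/2))."""
    nx, ny = len(f), len(f[0])
    lhs = abs(sum(b[x] * f[x][y] for x in range(nx) for y in range(ny))) / (nx * ny)
    inner = sum(
        f[x][y] * f[x][yp].conjugate()
        for x in range(nx) for y in range(ny) for yp in range(ny)
    )
    rhs = abs(inner / (nx * ny * ny)) ** 0.5
    return lhs, rhs


f, b = random_instance(0, 6, 7)
lhs, rhs = cauchy_schwarz_sides(f, b)
print(lhs <= rhs + 1e-12)
```

The right-hand side no longer involves $b$ at all, which is the sense in which such bounded weights are "annihilated" by the Cauchy-Schwarz step.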

Now, we develop some quadratic analogues to the linear phase estimates given above.
We begin with a quadratic counterpart to (A.1). We do not pretend that the exponents
here are even remotely optimal; we have opted for a statement which is conveniently
derived from our earlier lemmas.

Lemma A.11 (Weyl's inequality). Let $\alpha, \beta, \gamma \in \mathbb{R}$ and let $\delta \in (0, 1)$. Let $I \subset \mathbb{Z}$ be a

discrete interval such that \I\ ^ 2^^/ 6^ and 

$$|\mathbb{E}_{l \in I} e(\alpha l^2 + \beta l + \gamma)| \ge \delta.$$

Then we have 

243 

||q;||r/z,2125- 



514|J|2 



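Proof. By translating $I$ we may take $I = \{1, \dots, L\}$ for some $L$. Squaring the expression
gives a double sum over variables $l', l$; setting $l' = l + h$, we find that

$$\Big|\sum_{l \in I} e(\alpha l^2 + \beta l + \gamma)\Big|^2 = \sum_{h=-L}^{L}\ \sum_{l=\max(1-h,1)}^{\min(L-h,L)} e(2\alpha h l + \alpha h^2 + \beta h).$$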

Summing the inner geometric series using (A.1) we see that



,\2ah\ 

h=-L ' 
and therefore that

$$\sum_{h=1}^{L} \min\Big(L, \frac{1}{\|2\alpha h\|_{\mathbb{R}/\mathbb{Z}}}\Big) \ge \delta^2 L^2/8.$$

It follows that there are at least $\delta^2 L/16$ values of $h \in \{1, \dots, L\}$ such that
$\|2\alpha h\|_{\mathbb{R}/\mathbb{Z}} \le 16/\delta^2 L$. The claim then follows from Lemma A.4(ii). □

One can now repeat the proof of Lemma A.4(i), using Lemma A.11 in place of (A.1),
to conclude

Lemma A.12 (Recurrent quadratics are non-diophantine). Let $I \subset \mathbb{Z}$ be a discrete
interval, let $\alpha, \beta, \gamma$ be real numbers, and suppose that the set

$$\{l \in I : \|\alpha l^2 + \beta l + \gamma\|_{\mathbb{R}/\mathbb{Z}} \le \delta_1\}$$

has cardinality at least S2\I\ for some < 61,62 < I with 61 ^ ^62. If \I\ ^ 2^^5^^^, 
then we have 

^ ol41r-28| r|-2 

I|"IIr/Z,2435-9 ^ ^ f'2 K I ■ 



The final tool we assemble in this appendix is a technical lemma used in §. This allows
us to approximate a Lipschitz function $F$ by a "soft-thresholded" function $\tilde{F}$.

Lemma A.13 (Soft-thresholding a Lipschitz function). Let $F : X \to [-1, 1]$ be any
Lipschitz function on a metric space $(X, d)$, and let $\delta > 0$ be a parameter. Then there
is a Lipschitz function $\tilde{F} : X \to [-1, 1]$ satisfying the following properties:

(i) $\|\tilde{F}\|_{Lip} \le \|F\|_{Lip}$;

(ii) If $x \in \mathrm{Supp}(\tilde{F})$ and $d(x, x') \le \delta$ then $x' \in \mathrm{Supp}(F)$;

(iii) $\|F - \tilde{F}\|_\infty \le \delta \|F\|_{Lip}$.



Proof. We will set

$$\tilde{F}(x) := \max(|F(x)| - \lambda, 0)\, \mathrm{sgn}(F(x))$$

for an appropriate value of $\lambda \ge 0$ which we shall shortly specify. Let us first prove that
any such function satisfies (i). Since $|\tilde{F}|$ is pointwise bounded by $|F|$ it suffices to show
that if $x, x' \in X$ then

$$|\tilde{F}(x) - \tilde{F}(x')| \le |F(x) - F(x')|.$$

But this follows because the function $x \mapsto \max(|x| - \lambda, 0)\, \mathrm{sgn}(x)$ is easily seen to be a
contraction. This proves (i).

Now set $\lambda := \delta \|F\|_{Lip}$. Statement (iii) is then obvious. To prove (ii), note that if
$x \in \mathrm{Supp}(\tilde{F})$ then $|F(x)| > \lambda$. Thus if $d(x, x') \le \delta$ then

$$|F(x')| \ge |F(x)| - |F(x) - F(x')| \ge |F(x)| - \delta \|F\|_{Lip} > 0. \qquad □$$
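The construction in this proof is short enough to run directly. The sketch below checks the contraction property, property (ii) and the sup-norm bound (iii) on a finite grid with exact rational arithmetic. It is an illustration only: the names, the sample function $F(x) = |x - 1/2| - 1/4$ and the use of the Lipschitz constant 1 in place of $\|F\|_{Lip}$ are our own assumptions.

```python
from fractions import Fraction


def soft_threshold(value, lam):
    """The map t -> max(|t| - lam, 0) * sgn(t)."""
    if value > lam:
        return value - lam
    if value < -lam:
        return value + lam
    return value * 0


def soften(F, lipschitz_constant, delta):
    """Return the soft-thresholded function with lambda = delta * lipschitz_constant."""
    lam = delta * lipschitz_constant
    return lambda x: soft_threshold(F(x), lam)


# Hypothetical example on the grid {0, 1/40, ..., 1} with d(x, x') = |x - x'|.
X = [Fraction(k, 40) for k in range(41)]
F = lambda x: abs(x - Fraction(1, 2)) - Fraction(1, 4)
delta, lip = Fraction(1, 10), Fraction(1)
F_tilde = soften(F, lip, delta)

contraction = all(
    abs(F_tilde(x) - F_tilde(y)) <= abs(F(x) - F(y)) for x in X for y in X
)
support_ok = all(
    F(y) != 0 for x in X if F_tilde(x) != 0 for y in X if abs(x - y) <= delta
)
sup_close = all(abs(F(x) - F_tilde(x)) <= delta * lip for x in X)
print(contraction, support_ok, sup_close)
```

The point of property (ii) is visible here: the support of the thresholded function stays at distance more than $\delta$ from the zero set of $F$.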
The purpose of this appendix is to give the proof of Proposition 2.3, the statement of
which we recall now.

Proposition 2.3 (2-step nilsequences are averages of twisted 1-step nilsequences). Let
$G/\Gamma$ be a 2-step nilmanifold and let $0 < \varepsilon < 1/2$. Let $F : G/\Gamma \to \mathbb{C}$ be a bounded
Lipschitz function with $\|F\|_{Lip} \le 1$, and let $g \in G$ and $x \in G/\Gamma$ be arbitrary. Then
there exists a 1-step nilmanifold $\tilde{G}/\tilde{\Gamma}$ depending only on $G/\Gamma$ and a decomposition

$$F(T_g^n x) = \sum_{i \in I} w_i F_i(T_{g_i}^n x_i) e(-\phi_i(n)) + O(\varepsilon)$$

where

• $I$ is a finite index set;

• For each $i \in I$ the $w_i$ are complex numbers with $\sum_{i \in I} |w_i| \ll \varepsilon^{-O_{G/\Gamma}(1)}$;

• $F_i : \tilde{G}/\tilde{\Gamma} \to \mathbb{C}$ is bounded $O_{G/\Gamma}(1)$-Lipschitz;

• $g_i \in \tilde{G}$;

• $x_i \in \tilde{G}/\tilde{\Gamma}$;

• $\phi_i : B_i \to \mathbb{R}/\mathbb{Z}$ is a phase function which is locally quadratic on the generalized
Bohr set $B_i := \{n \in [N] : F_i(T_{g_i}^n x_i) \ne 0\}$.



As we remarked in ^ we are going to give a rather hands-on calculational approach 
to this theorem, using Mal'cev bases and the Heisenberg nilmanifold as an illustrative 
example. The reader interested in a comprehensive discussion of Mal'cev bases may 
consult the book [6]. 

Let $G$ be a connected, simply connected, 2-step nilpotent Lie group. Thus $G$ is a Lie
group, and the central series $G_0 = G_1 = G$, $G_2 := [G, G_1]$, $G_3 := [G, G_2]$ terminates at
the third step, so that $G_3 = \{e\}$. Let $\Gamma$ be a discrete, cocompact subgroup of $G$.

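The Heisenberg example. To motivate our arguments, let us first prove the above
Proposition in the model case of the Heisenberg nilmanifold $G/\Gamma$, with

$$G := \left\{ \begin{pmatrix} 1 & x_1 & x_3 \\ 0 & 1 & x_2 \\ 0 & 0 & 1 \end{pmatrix} : x_1, x_2, x_3 \in \mathbb{R} \right\}$$

and

$$\Gamma := \left\{ \begin{pmatrix} 1 & m_1 & m_3 \\ 0 & 1 & m_2 \\ 0 & 0 & 1 \end{pmatrix} : m_1, m_2, m_3 \in \mathbb{Z} \right\}.$$

Clearly $G_1 = G$ and

$$G_2 := [G, G_1] = \left\{ \begin{pmatrix} 1 & 0 & t \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} : t \in \mathbb{R} \right\}$$

and $G_3 := [G, G_2] = \{I\}$.
Let us distinguish elements

$$e_1 = \begin{pmatrix} 1 & 1 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}, \quad e_2 = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 1 \\ 0 & 0 & 1 \end{pmatrix}, \quad e_3 = \begin{pmatrix} 1 & 0 & 1 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}.$$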
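To these are associated the one-parameter subgroups $(e_i^t)_{t \in \mathbb{R}}$:

$$e_1^t = \begin{pmatrix} 1 & t & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}, \quad e_2^t = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & t \\ 0 & 0 & 1 \end{pmatrix}, \quad e_3^t = \begin{pmatrix} 1 & 0 & t \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}.$$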

Note that 

The collection $\{e_1, e_2, e_3\}$ is an example of a Mal'cev basis for $G$ which respects $\Gamma$, the
key feature to note being that $\Gamma$ is precisely the set $\{e_1^{m_1} e_2^{m_2} e_3^{m_3} : m_1, m_2, m_3 \in \mathbb{Z}\}$.

For Mal'cev coordinates to be of any use, we need to know how the group operation in
$G$ interacts with them. It is easy to explore this for the Heisenberg nilmanifold. Every
element $x = e_1^{t_1} e_2^{t_2} e_3^{t_3} \in G$ may be written in Mal'cev coordinates as $(t_1, t_2, t_3)_{II}$. It is a
simple matter to check that multiplication in $G$ is given by the rule

$$(t_1, t_2, t_3)_{II} * (u_1, u_2, u_3)_{II} = (t_1 + u_1, t_2 + u_2, t_3 + u_3 - t_2 u_1)_{II}. \tag{B.1}$$

A trivial induction confirms that if $g = (\alpha_1, \alpha_2, \alpha_3)_{II}$ then

$$g^n = (n\alpha_1, n\alpha_2, n\alpha_3 - \tfrac{1}{2} n(n-1) \alpha_1 \alpha_2)_{II}, \tag{B.2}$$

an expression which provides the first indication that 2-step nilmanifolds are somehow
associated with "quadratic" types of behaviour.
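Both (B.1) and (B.2) can be checked mechanically against matrix multiplication. The Python sketch below is an illustration with hypothetical function names; it uses the fact that $e_1^{t_1} e_2^{t_2} e_3^{t_3}$ is the upper triangular matrix with entries $t_1$, $t_2$ above the diagonal and $t_3 + t_1 t_2$ in the corner.

```python
from fractions import Fraction


def mat_mul(A, B):
    """Product of two 3 x 3 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def to_matrix(t):
    """Matrix of e1^t1 e2^t2 e3^t3 for second-kind coordinates t = (t1, t2, t3)."""
    t1, t2, t3 = t
    return [[1, t1, t3 + t1 * t2], [0, 1, t2], [0, 0, 1]]


def mult(t, u):
    """The multiplication rule (B.1) in coordinates of the second kind."""
    return (t[0] + u[0], t[1] + u[1], t[2] + u[2] - t[1] * u[0])


def power(g, n):
    """The closed form (B.2) for g^n, n >= 0."""
    a1, a2, a3 = g
    return (n * a1, n * a2, n * a3 - Fraction(n * (n - 1), 2) * a1 * a2)


# Hypothetical sample elements with rational coordinates.
t = (Fraction(1, 3), Fraction(-2, 5), Fraction(7, 4))
u = (Fraction(5, 2), Fraction(1, 7), Fraction(-3))
print(mat_mul(to_matrix(t), to_matrix(u)) == to_matrix(mult(t, u)))

g_power = (0, 0, 0)  # coordinates of the identity
for n in range(1, 8):
    g_power = mult(g_power, t)
    assert g_power == power(t, n)
```

The quadratic term $-\frac{1}{2} n(n-1)\alpha_1\alpha_2$ arises from accumulating the cross term $-t_2 u_1$ of (B.1) over the $n$ factors.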

To coordinatize the nilmanifold $G/\Gamma$, we pick a fundamental domain for the action of
$\Gamma$ on $G$. A very natural one is

$$\mathcal{F} := \{(x_1, x_2, x_3)_{II} : -\tfrac{1}{2} < x_1, x_2, x_3 \le \tfrac{1}{2}\}.$$

If $x = (x_1, x_2, x_3)_{II} \in G$, then we write $\gamma_x$ for the unique element of $\Gamma$ such that $x\gamma_x \in \mathcal{F}$.
We have

$$\gamma_x = (-[x_1], -[x_2], -[x_3 - [x_1] x_2])_{II},$$

where $[u] = u - \{u\}$ denotes the nearest integer function (fractional parts are taken to
have values in $(-\tfrac{1}{2}, \tfrac{1}{2}]$). Defining

$$\tau(x) = x\gamma_x,$$

we therefore have

$$\tau(x) = (\{x_1\}, \{x_2\}, \{x_3 - [x_1] x_2\})_{II}.$$

For any element $x$ we have that $x$ and $\tau(x)$ are equivalent under the action of $\Gamma$ on $G$.

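We may now analyse the map $T_g : G/\Gamma \to G/\Gamma$. Recall that if $\psi : G \to G/\Gamma$ is the
canonical projection then the transformation $T_g : G/\Gamma \to G/\Gamma$ is defined via the rule
$T_g(\psi(x)) = \psi(gx)$. Persisting with the notation $g = (\alpha_1, \alpha_2, \alpha_3)_{II}$ and using coordinates
on the fundamental domain $\mathcal{F}$ to represent $G/\Gamma$, we have

$$T_g^n(0) = \tau(g^n 0)$$

$$= (\{n\alpha_1\}, \{n\alpha_2\}, \{n\alpha_3 - \tfrac{1}{2} n(n-1) \alpha_1 \alpha_2 - [n\alpha_1] n\alpha_2\})_{II}$$

$$= (n\alpha_1, n\alpha_2, n\alpha_3 - \tfrac{1}{2} n(n-1) \alpha_1 \alpha_2 - [n\alpha_1] n\alpha_2)_{II} \pmod 1. \tag{B.3}$$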
This provides the first indication that nilmanifolds encode behaviour somewhat more
general than simply quadratic; here we have "generalised" quadratic behaviour typified
by the appearance of the "bracket quadratic" $[n\alpha_1] n\alpha_2$. We have now assembled
everything we need to prove Proposition 2.3 for the Heisenberg nilmanifold.



Proof of Proposition 2.3 for the Heisenberg nilmanifold. Let $F(T_g^n x)$ be a nilsequence
on $G/\Gamma$. For the sake of exposition we take $x = 0$ so that (B.3) applies. Let $\pi : G \to G/G_2$
be the canonical projection and, by abuse of notation, write $\pi : G/\Gamma \to G/\Gamma G_2$
for the induced projection. Now $G/\Gamma G_2$ is a 1-step nilmanifold, being the quotient of
$G/G_2$ by $\Gamma/\Gamma \cap G_2$, and we may identify it with $(\mathbb{R}/\mathbb{Z})^2$ via the coordinatization

$$\pi((t_1, t_2, t_3)_{II}) = (t_1, t_2).$$

Observe that $(\pi(T_g^n 0))_{n \in \mathbb{N}} = (T_{\pi(g)}^n 0)_{n \in \mathbb{N}}$ is an orbit on $G/\Gamma G_2$, generated by the rotation
$T_{\pi(g)} : (t_1, t_2) \mapsto (t_1 + \alpha_1, t_2 + \alpha_2)$ on the torus. Let

$$1 = \sum_{l=1}^{d} \psi_l,$$

$d = O(1)$, be a Lipschitz partition of unity on $(\mathbb{R}/\mathbb{Z})^2$ with the property that for each $l$
there are $x_1, x_2$ such that

$$\mathrm{Supp}(\psi_l) = [x_1, x_1 + \tfrac{1}{10}] \times [x_2, x_2 + \tfrac{1}{10}].$$

Then we have

$$F(T_g^n 0) = \sum_{l=1}^{d} \psi_l(T_{\pi(g)}^n 0) F(T_g^n 0).$$

We will look at each constituent nilsequence $\psi_l(T_{\pi(g)}^n 0) F(T_g^n 0)$, and write it in terms of
local quadratics on 1-step Bohr sets defined on $G/\Gamma G_2$.

Fix $l$, $1 \le l \le d$, together with the associated $x_1$ and $x_2$. Now the set
$U := \{x \in G/\Gamma : \pi(x) \in [x_1, x_1 + \tfrac{1}{10}] \times [x_2, x_2 + \tfrac{1}{10}]\}$ is diffeomorphic to the direct product

$$[x_1, x_1 + \tfrac{1}{10}] \times [x_2, x_2 + \tfrac{1}{10}] \times \mathbb{R}/\mathbb{Z},$$

which itself is diffeomorphic to a subset of $(\mathbb{R}/\mathbb{Z})^3$. Write $\pi_3 : U \to \mathbb{R}/\mathbb{Z}$ for projection
onto the third coordinate. Write $S$ for the set of all $n \in \mathbb{N}$ such that $T_g^n 0 \in U$. Note
that $S$ is a 1-step Bohr set, since

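Lemma B.1 (Local quadratic behaviour). Suppose that $n, h_1, h_2$ and $h_3$ are such that
all eight of the points $n + \epsilon_1 h_1 + \epsilon_2 h_2 + \epsilon_3 h_3$, $\epsilon_1, \epsilon_2, \epsilon_3 \in \{0, 1\}$, lie in $S$. Then the
$\pi_3$-coordinates are subject to the quadratic constraint

$$\sum_{\epsilon_1, \epsilon_2, \epsilon_3 \in \{0,1\}} (-1)^{\epsilon_1 + \epsilon_2 + \epsilon_3} \pi_3(T_g^{n + \epsilon_1 h_1 + \epsilon_2 h_2 + \epsilon_3 h_3} 0) = 0.$$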



Proof. Recall (B.3). Writing

$$f_1(n) := n\alpha_3 - \tfrac{1}{2} n(n-1) \alpha_1 \alpha_2 - [n\alpha_1] n\alpha_2,$$

we are to show that

$$\sum_{\epsilon_1, \epsilon_2, \epsilon_3 \in \{0,1\}} (-1)^{\epsilon_1 + \epsilon_2 + \epsilon_3} f_1(n + \epsilon_1 h_1 + \epsilon_2 h_2 + \epsilon_3 h_3) = 0$$

whenever the $n + \epsilon_1 h_1 + \epsilon_2 h_2 + \epsilon_3 h_3$ are all in $S$. We may write $f_1$ as the sum of a
quadratic polynomial and $f_2(n) := \{n\alpha_1\} n\alpha_2$. It suffices, then, to verify the result for
this function $f_2$ instead. To do this, we note that the obvious relations

$$\{(\epsilon_1 h_1 + \epsilon_2 h_2 + \epsilon_3 h_3)\alpha_1\} = \{(n + \epsilon_1 h_1 + \epsilon_2 h_2 + \epsilon_3 h_3)\alpha_1\} - \{n\alpha_1\} \pmod 1$$

are actually equalities in $\mathbb{R}$, and not just in $\mathbb{R}/\mathbb{Z}$, by virtue of the constraint that all
quantities $\{(n + \epsilon_1 h_1 + \epsilon_2 h_2 + \epsilon_3 h_3)\alpha_1\}$ lie in the interval $[x_1, x_1 + \tfrac{1}{10}]$. Furthermore we
have such relations as

$$\{h_1 \alpha_1\} + \{h_2 \alpha_1\} = \{(h_1 + h_2)\alpha_1\}.$$

By employing these together with a few simple manipulations, the lemma follows. □

To introduce locally quadratic exponentials, we use Lemma A.9 to approximate
$F = F(u_1, u_2, u_3)$, considered as a function on $U \subset (\mathbb{R}/\mathbb{Z})^3$, by a sum of exponentials. For any
$\varepsilon$ we may pick $J = O(\varepsilon^{-3} \log^3(1/\varepsilon))$ together with complex numbers $c_1, \dots, c_J = O(1)$
and frequencies $m_1, \dots, m_J \in \mathbb{Z}^3$ so that

$$F(u_1, u_2, u_3) = \sum_{j=1}^{J} c_j e(m_j \cdot u) + O(\varepsilon)$$

for all $u = (u_1, u_2, u_3) \in U$. Using (B.3) we obtain the formula

$$F(T_g^n 0) = \sum_{j=1}^{J} c_j e(m_j^{(1)} \{n\alpha_1\} + m_j^{(2)} \{n\alpha_2\} + m_j^{(3)} \pi_3(T_g^n 0)) + O(\varepsilon).$$

Each function $e(m_j^{(1)} \{n\alpha_1\} + m_j^{(2)} \{n\alpha_2\})$ is a Lipschitz nilsequence on $G/\Gamma G_2$, that is
to say it can be written in the form $f_j(T_{\pi(g)}^n 0)$. Thus we can write

$$\psi_l(T_{\pi(g)}^n 0) F(T_g^n 0) = \sum_{j=1}^{J} f_j(T_{\pi(g)}^n 0) e(m_j^{(3)} \pi_3(T_g^n 0)) + O(\varepsilon).$$

By Lemma B.1 each of the constituents here is a local quadratic on a 1-step Bohr
set. This concludes the proof of Proposition 2.3 in the special case of the Heisenberg
nilmanifold. □

The general case. The above arguments can be extended to more general
nilpotent groups. To do so, we need to involve the Lie algebra $\mathfrak{g}$ associated to $G$ together
with the exponential map

$$\exp : \mathfrak{g} \to G.$$

For the Heisenberg nilmanifold $\mathfrak{g}$ may be identified with the Lie algebra of strictly upper
triangular $3 \times 3$ matrices over $\mathbb{R}$ with 0's on the diagonal, that is to say

$$\mathfrak{g} = \left\{ \begin{pmatrix} 0 & u_1 & u_3 \\ 0 & 0 & u_2 \\ 0 & 0 & 0 \end{pmatrix} : u_1, u_2, u_3 \in \mathbb{R} \right\}.$$
The exponential map is given by matrix exponentiation, so $\exp(X) = e^X$, which in
practice means that if

$$X = \begin{pmatrix} 0 & u_1 & u_3 \\ 0 & 0 & u_2 \\ 0 & 0 & 0 \end{pmatrix}$$

then

$$\exp(X) = \begin{pmatrix} 1 & u_1 & u_3 + \tfrac{1}{2} u_1 u_2 \\ 0 & 1 & u_2 \\ 0 & 0 & 1 \end{pmatrix}.$$

With the notation of Lie algebras and the exponential map it is possible to define, for a
connected, simply-connected, nilpotent Lie group $G$, the 1-parameter subgroup $(g^t)_{t \in \mathbb{R}}$
associated to an element $g \in G$. Thus we set

$$\exp(X)^t := \exp(tX),$$

for all $X \in \mathfrak{g}$ and $t \in \mathbb{R}$.
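For the Heisenberg Lie algebra the exponential series terminates after the quadratic term, since $X^3 = 0$, so the formula for $\exp(X)$ and the one-parameter subgroup property can be verified exactly. The Python sketch below is an illustration with hypothetical helper names; it is not taken from the paper.

```python
from fractions import Fraction

IDENTITY = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]


def mat_mul(A, B):
    """Product of two 3 x 3 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def mat_add(A, B):
    return [[A[i][j] + B[i][j] for j in range(3)] for i in range(3)]


def scale(c, A):
    return [[c * A[i][j] for j in range(3)] for i in range(3)]


def lie_element(u1, u2, u3):
    """The strictly upper triangular matrix X with entries u1, u2, u3."""
    return [[0, u1, u3], [0, 0, u2], [0, 0, 0]]


def exp(X):
    """exp(X) = I + X + X^2 / 2, exact here because X^3 = 0."""
    return mat_add(mat_add(IDENTITY, X), scale(Fraction(1, 2), mat_mul(X, X)))


# Hypothetical sample values.
u1, u2, u3 = Fraction(2, 3), Fraction(-5), Fraction(1, 4)
X = lie_element(u1, u2, u3)
print(exp(X) == [[1, u1, u3 + u1 * u2 / 2], [0, 1, u2], [0, 0, 1]])

s, t = Fraction(3, 7), Fraction(-2, 5)
print(mat_mul(exp(scale(s, X)), exp(scale(t, X))) == exp(scale(s + t, X)))
```

The second check is the statement that $t \mapsto \exp(tX)$ is a homomorphism from $\mathbb{R}$ to $G$, which is what justifies the notation $\exp(X)^t$.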

We can now obtain Mal'cev coordinates for any nilmanifold arising from a connected 
and simply connected Lie group: 

Proposition B.2 (Mal'cev coordinates of the second kind). Let G be a connected and 
simply connected s-step nilpotent Lie group with central series 

G = Go = Gi D G2 ^ Gs ^ ■ ■ • 3 G,+i = {e}. 

Let r be a discrete, cocompact subgroup of G. Then there is a collection 

{ei, . . . , 6j-^, ej-^_|_i, . . . , 6j2, 6,2+1, • • • , Cj^} 

such that 



(i) Suppose that j G {1, . . . , s + 1}, and define io := 1. Then every element of Gj 
can be wr 

(ii) We have 



can be written uniquely as e^^^l . . . e*^'''^, for real numbers tj^+i, . . . , tg+i- 



It turns out to be more natural to deal with coordinates of the first kind, which are
defined on the Lie algebra $\mathfrak{g}$. Before defining these, we assemble some slightly disparate
facts about how the exponential map provides a link between $\mathfrak{g}$ and $G$ in the nilpotent
case. It is not particularly easy to find proofs of all of these statements in one place:
our main resources were [1] and [6].

Proposition B.3 (Nilpotent Lie algebras and groups). Let $G$ be a connected, simply
connected, $s$-step nilpotent Lie group. Let $\mathfrak{g}$ be the corresponding Lie algebra, and let
$\exp : \mathfrak{g} \to G$ be the exponential map. We have the following statements.

(i) $\exp$ is a diffeomorphism between $\mathfrak{g}$ and $G$, both of which are diffeomorphic to
some Euclidean space $\mathbb{R}^d$.

(ii) Define the central series of $\mathfrak{g}$ by $\mathfrak{g}_0 = \mathfrak{g}_1 := \mathfrak{g}$ and $\mathfrak{g}_{i+1} = [\mathfrak{g}, \mathfrak{g}_i]$ for $i \ge 1$. Then
$\exp(\mathfrak{g}_i) = G_i$. In particular, the Lie algebra $\mathfrak{g}$ is $s$-step nilpotent. We have the
relations $[\mathfrak{g}_i, \mathfrak{g}_j] \subseteq \mathfrak{g}_{i+j}$ and $[G_i, G_j] \subseteq G_{i+j}$.

(iii) (Baker-Campbell-Hausdorff Formula) We have

$$\exp(X)\exp(Y) = \exp(Z),$$

where

$$Z = X + Y + \tfrac{1}{2}[X, Y] + \tfrac{1}{12}[X, [X, Y]] + \tfrac{1}{12}[Y, [Y, X]] + \dots$$

Remarks. The dots in (iii) are supposed to indicate that the Baker-Campbell-Hausdorff
formula has terms involving commutators of fourth and higher order. Note, however,
that since $\mathfrak{g}$ is nilpotent, the series does terminate. It is possible to give a description
of the whole series, though it does not have a particularly simple closed form. See [1].
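In the 2-step case all commutators of order three and higher vanish, so the formula reduces to $\exp(X)\exp(Y) = \exp(X + Y + \frac{1}{2}[X,Y])$. The following Python sketch checks this truncated identity exactly for the Heisenberg Lie algebra; the helper names and sample values are our own and are meant only as an illustration.

```python
from fractions import Fraction

N = 3
IDENTITY = [[int(i == j) for j in range(N)] for i in range(N)]
ZERO = [[0] * N for _ in range(N)]


def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(N)) for j in range(N)]
            for i in range(N)]


def lin(a, A, b, B):
    """The linear combination a * A + b * B of two matrices."""
    return [[a * A[i][j] + b * B[i][j] for j in range(N)] for i in range(N)]


def bracket(X, Y):
    """The Lie bracket [X, Y] = XY - YX."""
    return lin(1, mat_mul(X, Y), -1, mat_mul(Y, X))


def exp(X):
    """exp(X) = I + X + X^2 / 2, exact for strictly upper triangular 3 x 3 X."""
    return lin(1, lin(1, IDENTITY, 1, X), Fraction(1, 2), mat_mul(X, X))


def heis(u1, u2, u3):
    return [[0, u1, u3], [0, 0, u2], [0, 0, 0]]


def bch_two_step(X, Y):
    """Z = X + Y + [X, Y] / 2, the Baker-Campbell-Hausdorff series in the 2-step case."""
    return lin(1, lin(1, X, 1, Y), Fraction(1, 2), bracket(X, Y))


# Hypothetical sample elements of the Heisenberg Lie algebra.
X = heis(Fraction(1, 3), Fraction(-2), Fraction(5, 7))
Y = heis(Fraction(4), Fraction(1, 2), Fraction(-3))
print(mat_mul(exp(X), exp(Y)) == exp(bch_two_step(X, Y)))
print(bracket(X, bracket(X, Y)) == ZERO)
```

The second check confirms that the third order terms of the series really do vanish here, which is why the truncation is exact.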



We describe now the Mal'cev coordinates of the first kind: 

Theorem B.4 (Mal'cev coordinates of the first kind). Let $G$ be a connected, simply-connected,
nilpotent Lie group with Mal'cev basis $\{e_1, \dots, e_k\}$. Thus any element $g \in G$
may be written uniquely as $e_1^{t_1} \dots e_k^{t_k}$, giving rise to the Mal'cev coordinates of the second
kind $(t_1, \dots, t_k)_{II}$. Write $e_i = \exp(X_i)$, where $X_i \in \mathfrak{g}$. Then for any $g \in G$ there are
unique $\xi_1, \dots, \xi_k \in \mathbb{R}$ such that $g = \exp(\xi_1 X_1 + \dots + \xi_k X_k)$. We refer to the elements
of the $k$-tuple $(\xi_1, \dots, \xi_k)_I$ as the Mal'cev coordinates of the first kind. □

Remark. In view of Proposition B.2 (i) and Proposition B.3 (ii), we have

$$\mathfrak{g}_j = \mathrm{Span}_{\mathbb{R}}(X_{i_{j-1}+1}, \dots, X_{i_s}).$$

For the Heisenberg nilmanifold, note that $(t_1, t_2, t_3)_{II} = (t_1, t_2, t_3 + \tfrac{1}{2} t_1 t_2)_I$.

Writing $\tau : \mathbb{R}^k \to G$ for the map which identifies coordinates of the first kind with the
element in $G$ they represent, we see that $\tau^{-1}(\Gamma)$ is not a lattice. Fortunately, something
nearly as good is true.

Proposition B.5 (Fundamental domain description of a nilmanifold). [T, Ch IV.6].
Let $G/\Gamma$ be a nilmanifold, and suppose that $X_1, \dots, X_k$ is a Mal'cev basis of the first
kind in $\mathfrak{g}$. Let $\tau : (\xi_1, \dots, \xi_k)_I \mapsto \exp(\xi_1 X_1 + \dots + \xi_k X_k)$ be the coordinate map, and let
$\mathcal{F}$ be any region of the form

$$\{(\xi_1, \dots, \xi_k) : a_i \le \xi_i < a_i + 1 \text{ for all } i\}.$$

Then each point of $G$ is equivalent, under the right action of $\Gamma$, to precisely one point in
$\exp(\mathcal{F})$. Furthermore the natural projection map $\pi : G \to G/\Gamma$ is continuous on $\exp(\mathcal{F})$
and is a homeomorphism when restricted to the interior $\exp(\mathcal{F})^\circ$.



Our aim now is to describe the action of some $g = (\beta_1, \dots, \beta_k)_I$ on $G/\Gamma$ by finding
formulae analogous to (B.1), (B.2) and (B.3). The key tool is the Baker-Campbell-Hausdorff
formula. For notational simplicity we restrict to the 2-step case from now
on, and write m := ^2 and n := ^3. Thus the Mal'cev basis of the first kind for G is 
$\{X_1, \dots, X_m, X_{m+1}, \dots, X_n\}$, where

$$\mathrm{Span}_{\mathbb{R}}(X_{m+1}, \dots, X_n) = \mathfrak{g}_2 = [\mathfrak{g}, \mathfrak{g}].$$
The Lie algebra $\mathfrak{g}$ is completely specified by its structure constants, a collection of real
numbers $(a_{ijk})_{1 \le i,j \le m,\ m+1 \le k \le n}$ such that

$$[X_i, X_j] = \sum_{k=m+1}^{n} a_{ijk} X_k. \tag{B.4}$$

These constants can be arbitrary so long as $(a_{ijk})_{i,j \le m}$ is antisymmetric for each $k$,
though if we want $G$ to possess a cocompact subgroup $\Gamma$ then certain rationality conditions
must hold [18].

Lemma B.6 (Multiplication in coordinates of the first kind). Suppose that $G$ is a
connected and simply-connected 2-step nilpotent Lie group with group operation $*$, and
abuse notation by identifying elements of $G$ with their coordinates of the first kind. Then
we have

where the $\zeta_{\le m} := (\zeta_1, \dots, \zeta_m)$, $\xi_{\le m} := (\xi_1, \dots, \xi_m)$ and the $\phi_j$ are antisymmetric bilinear
forms.



Proof. This is a simple matter of combining the Baker-Campbell-Hausdorff formula with
the existence of structure constants (B.4). We remark that the presentation of a 2-step
nilmanifold in this form is essentially the same as an example discussed by Furstenberg
in [9].

Observe in particular that

$$g^n = (n\beta_1, \dots, n\beta_n)_I, \tag{B.5}$$

and thus

$$T_g^n x = (n\beta_1 + x_1, \dots, n\beta_m + x_m, n\beta'_{m+1} + x_{m+1}, \dots, n\beta'_n + x_n)_I \tag{B.6}$$

for certain constants $\beta'_j$ depending on $g$, $x$ and the bilinear forms $\phi_j$.

To coordinatize $G/\Gamma$ we pick, in view of Proposition B.5, the very natural fundamental
domain

If $x = (x_1, \dots, x_n)_I \in G$, then we write $\gamma_x$ for the unique element of $\Gamma$ such that
$x\gamma_x \in \mathcal{F}$. Write $\tau(x) = x\gamma_x$. We need a formula for $\gamma_x$ in terms of coordinates of the
first kind, and to obtain such a result we need a description of the lattice $\Gamma$ in terms
of these coordinates. Since $\Gamma$ may be identified with $\mathbb{Z}^n$ in coordinates of the second
kind, such a description can be obtained by finding the relation between the two types
of coordinate. Such a relation is easy to obtain. Indeed by definition we have

$$(t_1, \dots, t_n)_{II} = (t_1, 0, \dots, 0)_I * \dots * (0, \dots, 0, t_n)_I.$$

By inductive use of Lemma B.6 this quickly implies that

(tl, . . . , tn)ll = {tl, . . . , tm, Qm+litf^m), ■ ■ ■ , Qn{t^m))l (B-7) 

for certain quadratic forms qj. In fact these forms are rather related to the alternating 
forms ipj] if ip{x,y) = Y.k,Km^kiXiyk then q{x) = Y.k<i'^kiXkXi. 
In terms of coordinates of the first kind, then, we see that

$$\Gamma = \{(r_1, \dots, r_m, r_{m+1} + q_{m+1}(r_{\le m}), \dots, r_n + q_n(r_{\le m}))_I : r_1, \dots, r_n \in \mathbb{Z}\}.$$

It follows that

. . . , [Xn (Pn 

and that

$$\tau(x) = (\{x_1\}, \dots, \{x_m\}, \{x_{m+1} - \phi_{m+1}(x_{\le m}, [x]_{\le m}) + q_{m+1}([x]_{\le m})\}, \dots, \{x_n - \phi_n(x_{\le m}, [x]_{\le m}) + q_n([x]_{\le m})\})_I.$$

We remark that we have essentially provided an independent confirmation of Proposition
B.5 for 2-step nilmanifolds. The proof in the $s$-step case merely involves more notation.

Combining this with (B.6) leads to the analogue of (B.3):

$$T_g^n x = (n\beta_1, \dots, n\beta_m, \psi_{m+1}(n), \dots, \psi_n(n))_I \pmod 1,$$

where each $\psi_j$ has the form

m 

V'(n) = an + b + ^ Cin[n(3i] + ^ cik{n(3i]{n(3k}. 

The remainder of the proof of Proposition 2.3 is, from this point, almost identical to
the special case of the Heisenberg nilmanifold. We leave the details to the reader. □



Appendix C. Divisor moment estimates



We collect some standard moment estimates for the divisor function $\tau(n) := \sum_{d \mid n} 1$.
These are used to prove Proposition C.2, which is used in §11 to show that there are
not too many "collisions" occurring in sets such as $\{dw : D < d \le 2D;\ W < w \le 2W\}$.

The basic estimate we need is 

Lemma C.1. Let $m, N \ge 1$ be integers. Then we have the moment estimate

$$\mathbb{E}_{n \in [N]} \tau(n)^m \ll_m \log^{2^m - 1} N.$$

Proof. This is very standard: see, for example, [5] or [2^. For our application, the 
precise value of exponent 2™ — 1 does not need to be attained; any bound of the form 
log'^'" N would suffice. □ 
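The growth rate in Lemma C.1 can be explored numerically. The Python sketch below computes $\tau(n)$ with a divisor sieve and compares the $m$-th moment with $\log^{2^m - 1} N$. It is only an experiment at small $N$, with function names of our own choosing, and says nothing about the implied constant for large $N$.

```python
import math


def divisor_counts(N):
    """tau[n] = number of divisors of n for 1 <= n <= N (tau[0] is unused)."""
    tau = [0] * (N + 1)
    for d in range(1, N + 1):
        for multiple in range(d, N + 1, d):
            tau[multiple] += 1
    return tau


def moment(tau, N, m):
    """The average of tau(n)^m over 1 <= n <= N."""
    return sum(tau[n] ** m for n in range(1, N + 1)) / N


def normalised_moment(tau, N, m):
    """moment / log^(2^m - 1) N, the ratio that Lemma C.1 asserts is bounded."""
    return moment(tau, N, m) / math.log(N) ** (2 ** m - 1)


tau = divisor_counts(1000)
for N in (10, 100, 1000):
    print(N, normalised_moment(tau, N, 1), normalised_moment(tau, N, 2))
```

For $m = 1$ the ratio tends to 1 (Dirichlet's divisor problem), while for $m = 2$ the exponent 3 matches the classical asymptotic for $\sum_{n \le N} \tau(n)^2$.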

In particular, we have the second moment estimate

$$\mathbb{E}_{n \in [N]} \tau(n)^2 \ll \log^3 N$$

which by dyadic decomposition then implies

$$\sum_{n \in [N]} \frac{\tau(n)^2}{n} \ll \log^4 N. \tag{C.1}$$
Now if $A \subset \{1, \dots, N\}$ is a nonempty set of size $\alpha N$ and $m \ge 2$ is an integer, then from
Hölder's inequality we have

$$\mathbb{E}_{n \in A} \tau(n)^2 \le \alpha^{-2/m} (\mathbb{E}_{n \in [N]} \tau(n)^m)^{2/m} \ll_m \alpha^{-2/m} (\log N)^{2(2^m - 1)/m}.$$

In particular, for any k < 1/2 we have the moment estimate 

E„eAr(n)2 a-^ log^"'" N. (C.2) 
This estimate has the following consequence. 

Lemma C.2 (Divisor packing lemma). Let $A \subset \{1, \dots, N\}$ be a non-empty set of size
$\alpha N$, and for each $d \ge 1$ let $A_d := \{n \in A : d \mid n\}$ denote those elements of $A$ which are
multiples of $d$. Suppose $\mathcal{D} \subset \mathbb{Z}^+$ is a finite set of positive integers such that

$$|A_d| \ge \delta |A|$$

for all $d \in \mathcal{D}$ and some $\delta > 0$. Then for any positive $\kappa < 1/2$ we have

||J^<^I». 6'm'\A\a-\og-''''' N. 



Proof. From hypothesis we have 



By Cauchy-Schwarz we conclude that 



From the trivial bound 



and flC.2p we thus have 



rfeS d\n 



-a-'^log' A^>«(5'|D| 



1^1 

and the claim follows. □ 